RE: [PATCH 1/2] hw/virtio/vhost-user: don't use uninitialized variable

2022-05-31 Thread Liu, Changpeng


> -Original Message-
> From: Alex Bennée 
> Sent: Tuesday, May 31, 2022 10:46 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org
> Subject: Re: [PATCH 1/2] hw/virtio/vhost-user: don't use uninitialized 
> variable
> 
> 
> Changpeng Liu  writes:
> 
> > Variable `vdev` in `struct vhost_dev` will not be ready
> > until start the device, so let's not use it for the error
> > output here.
> 
> This seems to be one of the areas where vhost_user_backend_dev_init and
> vhost_dev_init do things differently. Is there any particular reason why
> we couldn't initialise hdev->vdev consistently at init time?
vhost_dev_init() set hdev->vdev to NULL, and vhost_dev_start() set it to 
VirtIODevice,
it's consistent, they are common APIs designed for vhost-kernel and vhost-user.
> 
> >
> > Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when
> supported")
> >
> > Signed-off-by: Changpeng Liu 
> > ---
> >  hw/virtio/vhost-user.c | 8 +++-
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index b040c1ad2b..0594178224 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -2031,18 +2031,16 @@ static int vhost_user_backend_init(struct
> vhost_dev *dev, void *opaque,
> >  if (supports_f_config) {
> >  if (!virtio_has_feature(protocol_features,
> >  VHOST_USER_PROTOCOL_F_CONFIG)) {
> > -error_setg(errp, "vhost-user device %s expecting "
> > +error_setg(errp, "vhost-user device expecting "
> > "VHOST_USER_PROTOCOL_F_CONFIG but the vhost-user
> backend does "
> > -   "not support it.", dev->vdev->name);
> > +   "not support it.");
> >  return -EPROTO;
> >  }
> >  } else {
> >  if (virtio_has_feature(protocol_features,
> > VHOST_USER_PROTOCOL_F_CONFIG)) {
> >  warn_reportf_err(*errp, "vhost-user backend supports "
> > - "VHOST_USER_PROTOCOL_F_CONFIG for "
> > - "device %s but QEMU does not.",
> > - dev->vdev->name);
> > + "VHOST_USER_PROTOCOL_F_CONFIG but QEMU 
> > does not.");
> >  protocol_features &= ~(1ULL << 
> > VHOST_USER_PROTOCOL_F_CONFIG);
> >  }
> >  }
> 
> 
> --
> Alex Bennée


RE: [PATCH] vhost-user-scsi: implement handle_output

2019-10-21 Thread Liu, Changpeng
There is some logic in vhost_user_blk_handle_output() for now, it's not empty 
as vhost-user-scsi.
There should be other issue if it can't start from SeaBIOS.

> -Original Message-
> From: Felipe Franciosi [mailto:fel...@nutanix.com]
> Sent: Monday, October 21, 2019 4:00 PM
> To: Yongji Xie ; Liu, Changpeng
> 
> Cc: Stefan Hajnoczi ; Alex Williamson
> ; Dr . David Alan Gilbert ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH] vhost-user-scsi: implement handle_output
> 
> 
> 
> > On Oct 21, 2019, at 5:01 AM, Yongji Xie  wrote:
> >
> > On Fri, 18 Oct 2019 at 19:14, Felipe Franciosi  wrote:
> >>
> >>
> >>
> >>> On Oct 18, 2019, at 3:59 AM, Yongji Xie  wrote:
> >>>
> >>> On Fri, 18 Oct 2019 at 01:17, Felipe Franciosi  wrote:
> >>>>
> >>>> Originally, vhost-user-scsi did not implement a handle_output callback
> >>>> as that didn't seem necessary. Turns out it is.
> >>>>
> >>>> Depending on which other devices are presented to a VM, SeaBIOS may
> >>>> decide to map vhost-user-scsi devices on the 64-bit range of the address
> >>>> space. As a result, SeaBIOS will kick VQs via the config space. Those
> >>>> land on Qemu (not the vhost backend) and are missed, causing the VM not
> >>>> to boot. This fixes the issue by getting Qemu to post the notification.
> >>>>
> >>> Should we fix this in vhost-user-blk too?
> >>
> >> I'm not sure vhost-user-blk suffers from the same problem. Certainly
> >
> > Actually I found vhost-user-blk has the same problem in a mutilple
> > GPUs passthough environment.
> 
> Let's Cc Changpeng for comments. I'm not familiar with that code.
> 
> In any case, I still think we should merge this and fix other
> implementations separately. That allows us to revert patches
> individually if anything else breaks.
> 
> F.
> 
> >
> > Thanks,
> > Yongji




Re: [Qemu-devel] [PATCH v3] virtio-blk: set correct config size for the host driver

2019-02-12 Thread Liu, Changpeng



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, February 12, 2019 11:11 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; stefa...@redhat.com; sgarz...@redhat.com;
> dgilb...@redhat.com; ldok...@redhat.com
> Subject: Re: [PATCH v3] virtio-blk: set correct config size for the host 
> driver
> 
> On Tue, Feb 12, 2019 at 11:19:49PM +0800, Changpeng Liu wrote:
> > Commit caa1ee43 "vhost-user-blk: add discard/write zeroes features
> > support" added fields to struct virtio_blk_config. This changes
> > the size of the config space and breaks migration from QEMU 3.1
> > and older:
> >
> > qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x10 read: 41
> device: 1 cmask: ff wmask: 80 w1cmask:0
> > qemu-system-ppc64: Failed to load PCIDevice:config
> > qemu-system-ppc64: Failed to load virtio-blk:virtio
> > qemu-system-ppc64: error while loading state for instance 0x0 of device
> 'pci@8002000:01.0/virtio-blk'
> > qemu-system-ppc64: load of migration failed: Invalid argument
> >
> > Since virtio-blk doesn't support the "discard" and "write zeroes"
> > features, it shouldn't even expose the associated fields in the
> > config space actually. Just include all fields up to num_queues to
> > match QEMU 3.1 and older.
> >
> > Signed-off-by: Changpeng Liu 
> 
> OK almost.
> 
> > ---
> >  hw/block/virtio-blk.c | 14 ++
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index 9a87b3b..0ff5315 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -28,6 +28,10 @@
> >  #include "hw/virtio/virtio-bus.h"
> >  #include "hw/virtio/virtio-access.h"
> >
> > +/* We don't support discard yet, hide associated config fields. */
> > +#define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \
> > + max_discard_sectors)
> > +
> >  static void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
> >  VirtIOBlockReq *req)
> >  {
> > @@ -761,7 +765,9 @@ static void virtio_blk_update_config(VirtIODevice
> *vdev, uint8_t *config)
> >  blkcfg.alignment_offset = 0;
> >  blkcfg.wce = blk_enable_write_cache(s->blk);
> >  virtio_stw_p(vdev, _queues, s->conf.num_queues);
> > -memcpy(config, , sizeof(struct virtio_blk_config));
> > +memcpy(config, , VIRTIO_BLK_CFG_SIZE);
> > +QEMU_BUILD_BUG_ON(VIRTIO_BLK_CFG_SIZE > sizeof(struct
> virtio_blk_config));
> > +
> 
> Oh probably sizeof blkcfg here, right?
> Also we don't need the empty line here.
Ok, removed the empty line and use sizeof(blkcfg) instead with v4.
> 
> >  }
> >
> >  static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t 
> > *config)
> > @@ -769,7 +775,8 @@ static void virtio_blk_set_config(VirtIODevice *vdev,
> const uint8_t *config)
> >  VirtIOBlock *s = VIRTIO_BLK(vdev);
> >  struct virtio_blk_config blkcfg;
> >
> > -memcpy(, config, sizeof(blkcfg));
> > +memcpy(, config, VIRTIO_BLK_CFG_SIZE);
> > +QEMU_BUILD_BUG_ON(VIRTIO_BLK_CFG_SIZE > sizeof(blkcfg));
> >
> >  aio_context_acquire(blk_get_aio_context(s->blk));
> >  blk_set_enable_write_cache(s->blk, blkcfg.wce != 0);
> > @@ -952,8 +959,7 @@ static void virtio_blk_device_realize(DeviceState *dev,
> Error **errp)
> >  return;
> >  }
> >
> > -virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > -sizeof(struct virtio_blk_config));
> > +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, VIRTIO_BLK_CFG_SIZE);
> >
> >  s->blk = conf->conf.blk;
> >  s->rq = NULL;
> > --
> > 1.9.3



Re: [Qemu-devel] [PATCH v2] virtio-blk: set correct config size for the host driver

2019-02-12 Thread Liu, Changpeng



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, February 12, 2019 9:49 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; stefa...@redhat.com; sgarz...@redhat.com;
> dgilb...@redhat.com; ldok...@redhat.com
> Subject: Re: [PATCH v2] virtio-blk: set correct config size for the host 
> driver
> 
> On Tue, Feb 12, 2019 at 02:01:44PM +0800, Changpeng Liu wrote:
> > Commit caa1ee43 "vhost-user-blk: add discard/write zeroes features support"
> > introduced extra fields to existing struct virtio_blk_config, when
> > migration was executed from older QEMU version to current head, it
> > will break the migration.  While here, set the correct config size
> > when initializing the host driver, for now, discard/write zeroes
> > are not supported by virtio-blk host driver, so set the config
> > size as before, users can change config size when adding the new
> > feature bits support.
> >
> > Signed-off-by: Changpeng Liu 
> 
> 
> Pls rewrite commit log as suggested in this thread
> and repost.
Thanks, repost a v3 patch to address the commit message and your following 
comments.
> 
> > ---
> >  hw/block/virtio-blk.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index 9a87b3b..846b7b9 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -28,6 +28,9 @@
> >  #include "hw/virtio/virtio-bus.h"
> >  #include "hw/virtio/virtio-access.h"
> >
> > +#define VIRTIO_BLK_CFG_SIZE (offsetof(struct virtio_blk_config, num_queues)
> + \
> > + sizeof_field(struct virtio_blk_config, 
> > num_queues))
> > +
> 
> I would just do offsetof(max_discard_sectors)
> with a comment "we don't support discard yet, hide associated config
> fields".
> 
> >  static void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
> >  VirtIOBlockReq *req)
> >  {
> > @@ -761,7 +764,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev,
> uint8_t *config)
> >  blkcfg.alignment_offset = 0;
> >  blkcfg.wce = blk_enable_write_cache(s->blk);
> >  virtio_stw_p(vdev, _queues, s->conf.num_queues);
> > -memcpy(config, , sizeof(struct virtio_blk_config));
> > +memcpy(config, , VIRTIO_BLK_CFG_SIZE);
> 
> 
> Let's add QEMU_BUILD_BUG_ON(VIRTIO_BLK_CFG_SIZE <= sizeof(struct
> virtio_blk_config)) just for documentation purposes.
> 
> >  }
> >
> >  static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t 
> > *config)
> > @@ -769,7 +772,7 @@ static void virtio_blk_set_config(VirtIODevice *vdev,
> const uint8_t *config)
> >  VirtIOBlock *s = VIRTIO_BLK(vdev);
> >  struct virtio_blk_config blkcfg;
> >
> > -memcpy(, config, sizeof(blkcfg));
> > +memcpy(, config, VIRTIO_BLK_CFG_SIZE);
> 
> 
> Here too, QEMU_BUILD_BUG_ON(VIRTIO_BLK_CFG_SIZE <= sizeof(blkcfg))
> 
> >
> >  aio_context_acquire(blk_get_aio_context(s->blk));
> >  blk_set_enable_write_cache(s->blk, blkcfg.wce != 0);
> > @@ -952,8 +955,7 @@ static void virtio_blk_device_realize(DeviceState *dev,
> Error **errp)
> >  return;
> >  }
> >
> > -virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > -sizeof(struct virtio_blk_config));
> > +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, VIRTIO_BLK_CFG_SIZE);
> >
> >  s->blk = conf->conf.blk;
> >  s->rq = NULL;
> > --
> > 1.9.3



Re: [Qemu-devel] [PATCH] virtio-blk: set correct config size for the host driver

2019-02-11 Thread Liu, Changpeng



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, February 12, 2019 12:31 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; stefa...@redhat.com; sgarz...@redhat.com;
> dgilb...@redhat.com; ldok...@redhat.com
> Subject: Re: [PATCH] virtio-blk: set correct config size for the host driver
> 
> On Tue, Feb 12, 2019 at 12:25:16PM +0800, Changpeng Liu wrote:
> > Commit caa1ee43 "vhost-user-blk: add discard/write zeroes features support"
> > introduced extra fields to existing struct virtio_blk_config, when
> > migration was executed from older QEMU version to current head, it
> > will break the migration.  While here, set the correct config size
> > when initializing the host driver, for now, discard/write zeroes
> > are not supported by virtio-blk host driver, so set the config
> > size as before, users can change config size when adding the new
> > feature bits support.
> >
> > Signed-off-by: Changpeng Liu 
> > ---
> >  hw/block/virtio-blk.c  | 15 +++
> >  include/hw/virtio/virtio-blk.h |  1 +
> >  2 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index 9a87b3b..fac5d33 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -761,7 +761,7 @@ static void virtio_blk_update_config(VirtIODevice
> *vdev, uint8_t *config)
> >  blkcfg.alignment_offset = 0;
> >  blkcfg.wce = blk_enable_write_cache(s->blk);
> >  virtio_stw_p(vdev, _queues, s->conf.num_queues);
> > -memcpy(config, , sizeof(struct virtio_blk_config));
> > +memcpy(config, , s->config_size);
> >  }
> >
> >  static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t 
> > *config)
> > @@ -769,7 +769,7 @@ static void virtio_blk_set_config(VirtIODevice *vdev,
> const uint8_t *config)
> >  VirtIOBlock *s = VIRTIO_BLK(vdev);
> >  struct virtio_blk_config blkcfg;
> >
> > -memcpy(, config, sizeof(blkcfg));
> > +memcpy(, config, s->config_size);
> >
> >  aio_context_acquire(blk_get_aio_context(s->blk));
> >  blk_set_enable_write_cache(s->blk, blkcfg.wce != 0);
> > @@ -847,6 +847,13 @@ static void virtio_blk_set_status(VirtIODevice *vdev,
> uint8_t status)
> >  }
> >  }
> >
> > +static void virtio_blk_set_config_size(VirtIOBlock *s)
> > +{
> > +/* VIRTIO_BLK_F_MQ is supported by host driver */
> 
> It can be disabled though.
It means the host driver can support this feature, users can use it or not.
So the config_size is same with previous old version.
> 
> It just so happens that only the addition of max_discard_seg
> crosses the next power of 2 boundary.
> 
> 
> > +s->config_size = offsetof(struct virtio_blk_config, num_queues) +
> > + sizeof_field(struct virtio_blk_config, num_queues);
> > +}
> > +
> >  static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
> >  {
> >  VirtIOBlock *s = VIRTIO_BLK(vdev);
> > @@ -952,8 +959,8 @@ static void virtio_blk_device_realize(DeviceState *dev,
> Error **errp)
> >  return;
> >  }
> >
> > -virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > -sizeof(struct virtio_blk_config));
> > +virtio_blk_set_config_size(s);
> > +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, s->config_size);
> >
> >  s->blk = conf->conf.blk;
> >  s->rq = NULL;
> > diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> > index 5117431..9181a93 100644
> > --- a/include/hw/virtio/virtio-blk.h
> > +++ b/include/hw/virtio/virtio-blk.h
> > @@ -51,6 +51,7 @@ typedef struct VirtIOBlock {
> >  void *rq;
> >  QEMUBH *bh;
> >  VirtIOBlkConf conf;
> > +size_t config_size;
> >  unsigned short sector_mask;
> >  bool original_wce;
> >  VMChangeStateEntry *change;
> 
> Well assuming you are looking for a minimal change,
> go further and drop config_size completely,
> replace with a macro.
OK.
> 
> 
> > --
> > 1.9.3



Re: [Qemu-devel] virtio-blk io bar size changed

2019-02-11 Thread Liu, Changpeng


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, February 12, 2019 1:23 AM
> To: Dr. David Alan Gilbert 
> Cc: qemu-devel@nongnu.org; Liu, Changpeng ;
> stefa...@redhat.com; ldok...@redhat.com
> Subject: Re: virtio-blk io bar size changed
> 
> On Mon, Feb 11, 2019 at 04:58:19PM +, Dr. David Alan Gilbert wrote:
> > Hi,
> >   Lukáš reported that there's a migration breakage between 3.1 and
> > current head with virtio-blk;   it looks like the io bar changes from 64
> > to 128 bytes and my bisect suggests it's:
> >
> > commit caa1ee43131c060347b32893abd41fe4865eaa2e (HEAD, refs/bisect/bad)
> > Author: Changpeng Liu 
> > Date:   Wed Jan 16 13:19:30 2019 +0800
> >
> > vhost-user-blk: add discard/write zeroes features support
> >
> >
> > How do we fix that?
> >
> > Dave
> 
> Quickly :)
> 
> See the logic around virtio_net_set_config_size
> and static VirtIOFeature feature_sizes - we need to
> add something similar to virtio blk now.
Correct, after this commit the BAR config size is changed, I will submit a 
patch to fix this.
> 
> 
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-29 Thread Liu, Changpeng



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, January 30, 2019 3:40 PM
> To: Michael S. Tsirkin 
> Cc: Stefan Hajnoczi ; Thomas Huth ;
> Stefano Garzarella ; Liu, Changpeng
> ; Laurent Vivier ; Kevin Wolf
> ; qemu-bl...@nongnu.org; qemu-devel@nongnu.org; Max
> Reitz ; Paolo Bonzini ; virtio-
> comm...@lists.oasis-open.org
> Subject: Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for
> WRITE_ZEROES command
> 
> On Sun, Jan 27, 2019 at 01:42:03PM -0500, Michael S. Tsirkin wrote:
> > On Sun, Jan 27, 2019 at 12:57:20PM +, Stefan Hajnoczi wrote:
> > > On Fri, Jan 25, 2019 at 02:17:01PM -0500, Michael S. Tsirkin wrote:
> > > > On Fri, Jan 25, 2019 at 03:12:45PM +, Stefan Hajnoczi wrote:
> > > > > Based on the Linux guest driver code and the lack of more evidence in
> > > > > the spec, I'm pretty sure data[] doesn't need to be padded to 512 
> > > > > bytes
> > > > > for discard/write zero requests.
> > > >
> > > > OK. Must devices support such padding?
> > >
> > > I see no reason to require padding.  Host APIs and physical storage
> > > controllers do not require it.
> >
> > I'm not talking about requiring padding I'm talking about
> > supporting padding on device side.
> >
> > One reason would be ease of extending the spec for guest.
> >
> > E.g. imagine adding more fields to the command.
> > With suppport for padding guest simply adds fields
> > to its structure, and the unused ones are ignored
> > by device. Without support for padding you need two
> > versions of the structure.
> 
> The simplest way is to say each struct virtio_blk_discard_write_zeroes
> (currently 16 bytes long) is padded to 512 bytes, but this wastes a lot
> of memory.
> 
> Stefano and I looked at the status of multiple segment discard/write
> zeroes in Linux: only the NVMe driver supports multiple segments.  Even
> SCSI sd doesn't support multiple segments.
> 
> Userspace APIs do not offer a way for applications to submit multiple
> segments in a single call.  Only the block I/O scheduler creates
> requests with multiple segments.
> 
> SPDK's virtio-blk driver and vhost-user-blk device backend also only
> support 1 segment per request.
> 
> Given this situation, I'm not sure it's worth the complexity to spec out
> a fancy padding scheme that no one will implement.  Wasting 512 - 16
> bytes per segment also seems unattractive.  IMO it's acceptable to
> introduce a feature bit if struct virtio_blk_discard_write_zeroes needs
> extension in the future.
> 
> > Another reason would be compatibility.  For better or worse the spec
> > does make it look like 512 padding is present. This patch was merged very
> > recently so I agree it's not too late to clarify it that it's not
> > required, but it seems a bit too much to disallow that completely.
> > Let's assume for a minute a device turns up that already
> > implemented the 512 byte assumption? If padding is allowed then we can
> > write a driver that works with both devices.
> 
> The Linux guest driver doesn't honor 512 byte padding, so device
> backends would already need to make an exception if we spec 512 byte
> padding.
> 
> Let's begin VIRTIO 1.1 with the state of real-world implementations:
> data[] must be a multiple of sizeof(struct
> virtio_blk_discard_write_zeroes).
Actually that's the purpose at the first beginning, padding to 512 bytes will 
waste some memory space, 
it's nice to have an exception other Read/Write commands. 
> 
> I'll send a patch to the virtio TC and we can discuss further in that
> thread.
> 
> Stefan



Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-25 Thread Liu, Changpeng


> -Original Message-
> From: Thomas Huth [mailto:th...@redhat.com]
> Sent: Friday, January 25, 2019 4:49 PM
> To: Stefano Garzarella ; Michael S. Tsirkin
> ; Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; Laurent Vivier ; Kevin Wolf
> ; qemu-bl...@nongnu.org; Max Reitz
> ; Stefan Hajnoczi ; Paolo Bonzini
> 
> Subject: Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for
> WRITE_ZEROES command
> 
> On 2019-01-25 09:16, Stefano Garzarella wrote:
> > On Fri, Jan 25, 2019 at 07:07:35AM +0100, Thomas Huth wrote:
> >> On 2019-01-25 07:01, Thomas Huth wrote:
> >>> On 2019-01-24 18:23, Stefano Garzarella wrote:
> >>>> If the WRITE_ZEROES feature is enabled, we check this
> >>>> command in the test_basic().
> >>>>
> >>>> Signed-off-by: Stefano Garzarella 
> >>>> ---
> >>>>  tests/virtio-blk-test.c | 63 +
> >>>>  1 file changed, 63 insertions(+)
> >>>>
> >>>> diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
> >>>> index 04c608764b..8cabbcb85a 100644
> >>>> --- a/tests/virtio-blk-test.c
> >>>> +++ b/tests/virtio-blk-test.c
> >>>> @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev,
> QGuestAllocator *alloc,
> >>>>
> >>>>  guest_free(alloc, req_addr);
> >>>>
> >>>> +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
> >>>> +struct virtio_blk_discard_write_zeroes *dwz_hdr;
> >>>> +void *expected;
> >>>> +
> >>>> +/*
> >>>> + * WRITE_ZEROES request on the same sector of previous test 
> >>>> where
> >>>> + * we wrote "TEST".
> >>>> + */
> >>>> +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
> >>>> +req.data = g_malloc0(512);
> >>>
> >>> Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> >>> something similar here, to see whether zeroes or 0xaa is written?
> >>
> >> Ah, never mind, I thought req.data would be a sector buffer here, but
> >> looking at the lines below, it apparently is something different.
> >>
> >> Why do you allocate 512 bytes here? I'd rather expect
> >> g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
> >> then you could also use a local "struct virtio_blk_discard_write_zeroes
> >> dwz_hdr" variable instead of a pointer, and drop the g_malloc0() 
> >> completely?
> >>
> >
> > Hi Thomas,
> > it was my initial implementation, but on the first test I discovered
> > that virtio_blk_request() has an assert on the data_size and it requires
> > a multiple of 512 bytes.
> > Then I looked at the virtio-spec #1, and it seems that data should be
> > multiple of 512 bytes also if it contains the struct
> > virtio_blk_discard_write_zeroes. (I'm not sure)
> >
> > Anyway I tried to allocate only the space for that struct, commented the
> > assert and the test works well.
> >
> > How do you suggest to proceed?
> 
> Wow, that's a tough question. Looking at the virtio spec, I agree with
> you, it looks like struct virtio_blk_discard_write_zeroes should be
> padded to 512 bytes here. But when I look at the Linux sources
> (drivers/block/virtio_blk.c), I fail to see that they are doing the
> padding there (but maybe I'm just too blind).
> 
> Looking at the QEMU sources, it seems like it can deal with both and
> always sets the status right behind the last byte:
> 
> req->in = (void *)in_iov[in_num - 1].iov_base
>   + in_iov[in_num - 1].iov_len
>   - sizeof(struct virtio_blk_inhdr);
> 
> Anyway, I think the virtio spec should be clearer here to avoid bad
> implementations in the future, so maybe Changpeng or Michael could
> update the spec here a little bit?
The data for Discard and Write Zeroes commands are struct 
virtio_blk_discard_write_zeroes
aligned, that means you can pass 16 bytes aligned data, based on the segments 
number supported,
this is also aligned with NVMe specification and  the SCSI specification.
> 
>  Thomas
> 
> 
> > [1](https://github.com/oasis-tcs/virtio-spec/blob/master/content.tex#L3944)
> >
> >
> > Thanks,
> > Stefano
> >



Re: [Qemu-devel] [PATCH] vhost-user-blk: enable discard/write zeroes features

2019-01-14 Thread Liu, Changpeng



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, January 14, 2019 6:42 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; stefa...@redhat.com; m...@redhat.com;
> sgaz...@redhat.com
> Subject: Re: [Qemu-devel] [PATCH] vhost-user-blk: enable discard/write zeroes
> features
> 
> On Mon, Jan 14, 2019 at 03:35:17PM +0800, Changpeng Liu wrote:
> > Linux commit 1f23816b8 "virtio_blk: add discard and write zeroes support"
> > added the support in the Guest kernel, while here enable the feature bits
> > support with vhost-user-blk driver.  Also enable the test example utility
> > with DISCARD command support.
> 
> The commit message mentions write zeroes but this patch only covers
> discard.  Will you send a separate patch for write zeros?
Not really, I don't have such plan at first, I can add it in next version.
> 
> > Signed-off-by: Changpeng Liu 
> 
> CCed Stefano, who is working on hw/block/virtio-blk.c emulation support.
> 
> > @@ -157,6 +161,29 @@ vub_writev(VubReq *req, struct iovec *iov, uint32_t
> iovcnt)
> >  return rc;
> >  }
> >
> > +static int
> > +vub_discard(VubReq *req, struct iovec *iov, uint32_t iovcnt)
> > +{
> > +if (iovcnt != 1) {
> 
> This is a virtio specification violation.  The iovec layout is
> intentionally not part of the specification, leaving the guest driver
> free to choose its preferred layout.
> 
> The device backend must accept any layout, including splitting a struct
> across iovecs or even many small iovecs of just 1 byte!
I see, the original intention here is just using 1 descriptor, which is 16 
bytes.
> 
> > +fprintf(stderr, "Invalid Discard IOV count\n");
> > +return -1;
> > +}
> > +
> > +#if defined(__linux__) && defined(BLKDISCARD)
> > +VubDev *vdev_blk = req->vdev_blk;
> > +struct virtio_blk_discard_write_zeroes *desc =
> > +   (struct virtio_blk_discard_write_zeroes 
> > *)iov->iov_base;
> 
> Missing input size check.  Even if this example isn't used in
> production, it's important to validate inputs since other people will
> implement their vhost-user backend based on this example.
> 
> Please check that sizeof(*desc) bytes are really available before
> accessing it.
Ok, will fix it.
> 
> > +case VIRTIO_BLK_T_DISCARD: {
> > +int rc;
> > +rc = vub_discard(req, >out_sg[1], out_num);
> > +if (rc == 0) {
> >  req->in->status = VIRTIO_BLK_S_OK;
> > -req->size = elem->in_sg[0].iov_len;
> > -vub_req_complete(req);
> > -break;
> > -}
> > -default: {
> > +} else {
> >  req->in->status = VIRTIO_BLK_S_UNSUPP;
> 
> Is there no IOERR case?  BLKDISCARD can probably fail due to an I/O
> error and that shouldn't be reported as UNSUPP.
Yes, should include both UNSUPP and ERR status here, will fix it.
> 
> > @@ -454,7 +492,7 @@ vub_get_blocksize(int fd)
> >
> >  #if defined(__linux__) && defined(BLKSSZGET)
> >  if (ioctl(fd, BLKSSZGET, ) == 0) {
> > -return blocklen;
> > +return blocksize;
> >  }
> >  #endif
> 
> Unrelated bug fix?  Please submit a separate patch.
Ok.
> 
> Do you know why the patchew, Travis, etc continuous integration systems
> didn't detect the compile error?  Please ensure that
> contrib/vhost-user-blk/ is covered by CI.
I don't include the header file before, so it will return hardcoded 512 bytes, 
which works for most devices.



Re: [Qemu-devel] [RFC v1] block/NVMe: introduce a new vhost NVMe host device to QEMU

2018-10-24 Thread Liu, Changpeng
The latest thread for supporting DISCARD and WRITE ZEROES feature for 
virtio-blk is here:

virtio_blk: add discard and write zeroes support
https://lists.linuxfoundation.org/pipermail/virtualization/2018-October/039534.html

I don't take further more steps about it, please go ahead and make the 
specification more clear, thanks.


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Wednesday, October 24, 2018 1:39 AM
> To: Paolo Bonzini 
> Cc: Liu, Changpeng ; qemu-devel@nongnu.org; Harris,
> James R ; Busch, Keith ;
> f...@redhat.com; stefa...@gmail.com; stefa...@redhat.com
> Subject: Re: [RFC v1] block/NVMe: introduce a new vhost NVMe host device to
> QEMU
> 
> On Tue, Jan 16, 2018 at 06:06:56PM +0100, Paolo Bonzini wrote:
> > Second, virtio-based vhost-user remains QEMU's preferred method for
> > high-performance I/O in guests.  Discard support is missing and that is
> > important for SSDs; that should be fixed in the virtio spec.
> 
> BTW could you reply on the thread of the patch
>   virtio_blk: add discard and write zeroes support
> 
> Christoph Hellwig thinks we should change the spec and defer
> implementation until we do. What's your take on this?
> 
> 
> --
> MST



Re: [Qemu-devel] [PATCH] vhost-blk: turn on pre-defined RO feature bit

2018-05-27 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Friday, May 25, 2018 6:20 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; m...@redhat.com
> Subject: Re: [PATCH] vhost-blk: turn on pre-defined RO feature bit
> 
> On Tue, May 22, 2018 at 02:39:28PM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Tuesday, May 22, 2018 6:11 AM
> > > To: Liu, Changpeng <changpeng@intel.com>
> > > Cc: qemu-devel@nongnu.org; m...@redhat.com
> > > Subject: Re: [PATCH] vhost-blk: turn on pre-defined RO feature bit
> > >
> > > On Sat, May 19, 2018 at 06:20:16AM +0800, Changpeng Liu wrote:
> > > > Read only feature shouldn't be negotiable, because if the
> > > > backend device reported Read only feature supported, QEMU
> > > > host driver shouldn't change backend's RO attribute.
> > >
> > > I don't understand this patch.
> > >
> > > Does it make *all* virtio-blk devices read-only?
> > If the slave target reported RO feature, the disk in OS should be RO, 
> > currently
> > users should specify in QEMU command to enable RO.
> > >
> > > Or is the idea that the vhost-user slave has to clear the read-only bit
> > > if it is a writable device?
> > Exactly.
> 
> Thanks for explaining!
> 
> Please implement the read-only bit in contrib/vhost-user-blk we it can
> be tested.
Sounds good, will add the test together.
> 
> Thanks,
> Stefan



Re: [Qemu-devel] [PATCH] vhost-blk: turn on pre-defined RO feature bit

2018-05-22 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Tuesday, May 22, 2018 6:11 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; m...@redhat.com
> Subject: Re: [PATCH] vhost-blk: turn on pre-defined RO feature bit
> 
> On Sat, May 19, 2018 at 06:20:16AM +0800, Changpeng Liu wrote:
> > Read only feature shouldn't be negotiable, because if the
> > backend device reported Read only feature supported, QEMU
> > host driver shouldn't change backend's RO attribute.
> 
> I don't understand this patch.
> 
> Does it make *all* virtio-blk devices read-only?
If the slave target reported RO feature, the disk in OS should be RO, currently
users should specify in QEMU command to enable RO.
> 
> Or is the idea that the vhost-user slave has to clear the read-only bit
> if it is a writable device?
Exactly.
> 
> Stefan



Re: [Qemu-devel] [PATCH v3 2/2] vhost-user: back SET/GET_CONFIG requests with a protocol feature

2018-04-08 Thread Liu, Changpeng


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 29, 2018 3:53 PM
> To: m...@redhat.com; Liu, Changpeng <changpeng@intel.com>;
> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> Cc: Maxime Coquelin <maxime.coque...@redhat.com>
> Subject: [PATCH v3 2/2] vhost-user: back SET/GET_CONFIG requests with a
> protocol feature
> 
> Without a dedicated protocol feature, QEMU cannot know whether
> the backend can handle VHOST_USER_SET_CONFIG and
> VHOST_USER_GET_CONFIG messages.
> 
> This patch adds a protocol feature that is only advertised by
> QEMU if the device implements the config ops. Vhost user init
> fails if the device support the feature but the backend doesn't.
> 
> The backend should only send VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
> requests if the protocol feature has been negotiated.
> 
> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
Acked-by: Changpeng Liu <changpeng@intel.com>
> ---
>  docs/interop/vhost-user.txt | 21 -
>  hw/virtio/vhost-user.c  | 22 ++
>  2 files changed, 34 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index c058c407df..534caab18a 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -379,6 +379,7 @@ Protocol features
>  #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
>  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
>  #define VHOST_USER_PROTOCOL_F_PAGEFAULT  8
> +#define VHOST_USER_PROTOCOL_F_CONFIG 9
> 
>  Master message types
>  
> @@ -664,7 +665,8 @@ Master message types
>Master payload: virtio device config space
>Slave payload: virtio device config space
> 
> -  Submitted by the vhost-user master to fetch the contents of the virtio
> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> +  submitted by the vhost-user master to fetch the contents of the virtio
>device configuration space, vhost-user slave's payload size MUST match
>master's request, vhost-user slave uses zero length of payload to
>indicate an error to vhost-user master. The vhost-user master may
> @@ -677,7 +679,8 @@ Master message types
>Master payload: virtio device config space
>Slave payload: N/A
> 
> -  Submitted by the vhost-user master when the Guest changes the virtio
> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> +  submitted by the vhost-user master when the Guest changes the virtio
>device configuration space and also can be used for live migration
>on the destination host. The vhost-user slave must check the flags
>field, and slaves MUST NOT accept SET_CONFIG for read-only
> @@ -766,13 +769,13 @@ Slave message types
>   Slave payload: N/A
>   Master payload: N/A
> 
> - Vhost-user slave sends such messages to notify that the virtio device's
> - configuration space has changed, for those host devices which can 
> support
> - such feature, host driver can send VHOST_USER_GET_CONFIG message to
> slave
> - to get the latest content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is
> - negotiated, and slave set the VHOST_USER_NEED_REPLY flag, master must
> - respond with zero when operation is successfully completed, or non-zero
> - otherwise.
> + When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, vhost-user slave
> sends
> + such messages to notify that the virtio device's configuration space has
> + changed, for those host devices which can support such feature, host
> + driver can send VHOST_USER_GET_CONFIG message to slave to get the latest
> + content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, and slave
> set
> + the VHOST_USER_NEED_REPLY flag, master must respond with zero when
> + operation is successfully completed, or non-zero otherwise.
> 
>  VHOST_USER_PROTOCOL_F_REPLY_ACK:
>  ---
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 44aea5c0a8..38da8692bb 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -46,6 +46,7 @@ enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
>  VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
>  VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> +VHOST_USER_PROTOCOL_F_CONFIG = 9,
>  VHOST_USER_PROTOCOL_F_MAX
>  };
> 
> @@ -1211,6 +1212,17 @@ static int vhost_user_init(struct vhost_dev *dev, void
> *opaque)
> 
>  dev->protocol_features =
>  protocol_features & VHO

Re: [Qemu-devel] [PATCH v3 1/2] vhost-user-blk: set config ops before vhost-user init

2018-04-08 Thread Liu, Changpeng


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 29, 2018 3:53 PM
> To: m...@redhat.com; Liu, Changpeng <changpeng@intel.com>;
> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> Cc: Maxime Coquelin <maxime.coque...@redhat.com>
> Subject: [PATCH v3 1/2] vhost-user-blk: set config ops before vhost-user init
> 
> As soon as vhost-user init is done, the backend may send
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, so let's set the
> notification callback before it.
> 
> Also, it will be used to know whether the device supports
> the config feature to advertize it or not.
> 
> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
Acked-by: Changpeng Liu <changpeng@intel.com>
> ---
>  hw/block/vhost-user-blk.c | 4 ++--
>  hw/virtio/vhost.c | 1 -
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index f840f07dfe..262baca432 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -259,6 +259,8 @@ static void vhost_user_blk_device_realize(DeviceState
> *dev, Error **errp)
>  s->dev.vq_index = 0;
>  s->dev.backend_features = 0;
> 
> +vhost_dev_set_config_notifier(>dev, _ops);
> +
>  ret = vhost_dev_init(>dev, >chardev, VHOST_BACKEND_TYPE_USER, 0);
>  if (ret < 0) {
>  error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> @@ -277,8 +279,6 @@ static void vhost_user_blk_device_realize(DeviceState
> *dev, Error **errp)
>  s->blkcfg.num_queues = s->num_queues;
>  }
> 
> -vhost_dev_set_config_notifier(>dev, _ops);
> -
>  return;
> 
>  vhost_err:
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 250f886acb..b6c314e350 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1451,7 +1451,6 @@ int vhost_dev_set_config(struct vhost_dev *hdev,
> const uint8_t *data,
>  void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
> const VhostDevConfigOps *ops)
>  {
> -assert(hdev->vhost_ops);
>  hdev->config_ops = ops;
>  }
> 
> --
> 2.14.3




Re: [Qemu-devel] [PATCH 1/2] contrib/libvhost-user: add the protocol feature used for SET/GET message

2018-03-29 Thread Liu, Changpeng
The patch serials depend on Maxime's patch "vhost-user: back SET/GET_CONFIG 
requests with a protocol feature",
which adding the protocol feature bit for SET/GET messages.

> -Original Message-----
> From: Liu, Changpeng
> Sent: Friday, March 30, 2018 10:46 AM
> To: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org
> Cc: m...@redhat.com; marcandre.lur...@redhat.com;
> maxime.coque...@redhat.com
> Subject: [PATCH 1/2] contrib/libvhost-user: add the protocol feature used for
> SET/GET message
> 
> Signed-off-by: Changpeng Liu <changpeng@intel.com>
> ---
>  contrib/libvhost-user/libvhost-user.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/contrib/libvhost-user/libvhost-user.h 
> b/contrib/libvhost-user/libvhost-
> user.h
> index 79f7a53..b27075e 100644
> --- a/contrib/libvhost-user/libvhost-user.h
> +++ b/contrib/libvhost-user/libvhost-user.h
> @@ -50,6 +50,7 @@ enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
>  VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
>  VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> +VHOST_USER_PROTOCOL_F_CONFIG = 9,
> 
>  VHOST_USER_PROTOCOL_F_MAX
>  };
> --
> 1.9.3




Re: [Qemu-devel] [PATCH v2 2/2] vhost-user: back SET/GET_CONFIG requests with a protocol feature

2018-03-29 Thread Liu, Changpeng


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 29, 2018 3:56 PM
> To: Liu, Changpeng <changpeng@intel.com>; m...@redhat.com;
> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 2/2] vhost-user: back SET/GET_CONFIG requests with a
> protocol feature
> 
> 
> 
> On 03/29/2018 02:57 AM, Liu, Changpeng wrote:
> >
> >
> >> -Original Message-
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Thursday, March 29, 2018 3:28 AM
> >> To: m...@redhat.com; Liu, Changpeng <changpeng@intel.com>;
> >> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> >> Cc: Maxime Coquelin <maxime.coque...@redhat.com>
> >> Subject: [PATCH v2 2/2] vhost-user: back SET/GET_CONFIG requests with a
> >> protocol feature
> >>
> >> Without a dedicated protocol feature, QEMU cannot know whether
> >> the backend can handle VHOST_USER_SET_CONFIG and
> >> VHOST_USER_GET_CONFIG messages.
> >>
> >> This patch adds a protocol feature that is only advertised by
> >> QEMU if the device implements the config ops. Vhost user init
> >> fails if the device support the feature but the backend doesn't.
> >>
> >> The backend should only send VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
> >> requests if the protocol feature has been negotiated.
> >>
> >> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
> >
> > Passed our own vhost-user-blk target with the patch, I can submit a fix to
> QEMU vhost-user-blk example
> > after this commit.
> >
> > Tested-by: Changpeng Liu <changpeng@intel.com>
> 
> Thanks for having tested, and for proposing to implement the example
> part. Just notice I forgot to apply your Tested-by on the series.
I have added the fix to the example based on your patch, I'm wondering
I can send it out for review now or wait for your commit being merged ?
> 
> Maxime
> >> ---
> >>   docs/interop/vhost-user.txt | 21 -
> >>   hw/virtio/vhost-user.c  | 22 ++
> >>   2 files changed, 34 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> >> index c058c407df..534caab18a 100644
> >> --- a/docs/interop/vhost-user.txt
> >> +++ b/docs/interop/vhost-user.txt
> >> @@ -379,6 +379,7 @@ Protocol features
> >>   #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
> >>   #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> >>   #define VHOST_USER_PROTOCOL_F_PAGEFAULT  8
> >> +#define VHOST_USER_PROTOCOL_F_CONFIG 9
> >>
> >>   Master message types
> >>   
> >> @@ -664,7 +665,8 @@ Master message types
> >> Master payload: virtio device config space
> >> Slave payload: virtio device config space
> >>
> >> -  Submitted by the vhost-user master to fetch the contents of the 
> >> virtio
> >> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> >> +  submitted by the vhost-user master to fetch the contents of the 
> >> virtio
> >> device configuration space, vhost-user slave's payload size MUST 
> >> match
> >> master's request, vhost-user slave uses zero length of payload to
> >> indicate an error to vhost-user master. The vhost-user master may
> >> @@ -677,7 +679,8 @@ Master message types
> >> Master payload: virtio device config space
> >> Slave payload: N/A
> >>
> >> -  Submitted by the vhost-user master when the Guest changes the virtio
> >> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> >> +  submitted by the vhost-user master when the Guest changes the virtio
> >> device configuration space and also can be used for live migration
> >> on the destination host. The vhost-user slave must check the flags
> >> field, and slaves MUST NOT accept SET_CONFIG for read-only
> >> @@ -766,13 +769,13 @@ Slave message types
> >>Slave payload: N/A
> >>Master payload: N/A
> >>
> >> - Vhost-user slave sends such messages to notify that the virtio 
> >> device's
> >> - configuration space has changed, for those host devices which can
> support
> >> - such feature, host driver can send

Re: [Qemu-devel] [PATCH v2 2/2] vhost-user: back SET/GET_CONFIG requests with a protocol feature

2018-03-28 Thread Liu, Changpeng


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 29, 2018 3:28 AM
> To: m...@redhat.com; Liu, Changpeng <changpeng@intel.com>;
> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> Cc: Maxime Coquelin <maxime.coque...@redhat.com>
> Subject: [PATCH v2 2/2] vhost-user: back SET/GET_CONFIG requests with a
> protocol feature
> 
> Without a dedicated protocol feature, QEMU cannot know whether
> the backend can handle VHOST_USER_SET_CONFIG and
> VHOST_USER_GET_CONFIG messages.
> 
> This patch adds a protocol feature that is only advertised by
> QEMU if the device implements the config ops. Vhost user init
> fails if the device support the feature but the backend doesn't.
> 
> The backend should only send VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
> requests if the protocol feature has been negotiated.
> 
> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>

Passed our own vhost-user-blk target with the patch, I can submit a fix to QEMU 
vhost-user-blk example
after this commit.

Tested-by: Changpeng Liu <changpeng@intel.com>
> ---
>  docs/interop/vhost-user.txt | 21 -
>  hw/virtio/vhost-user.c  | 22 ++
>  2 files changed, 34 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index c058c407df..534caab18a 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -379,6 +379,7 @@ Protocol features
>  #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
>  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
>  #define VHOST_USER_PROTOCOL_F_PAGEFAULT  8
> +#define VHOST_USER_PROTOCOL_F_CONFIG 9
> 
>  Master message types
>  
> @@ -664,7 +665,8 @@ Master message types
>Master payload: virtio device config space
>Slave payload: virtio device config space
> 
> -  Submitted by the vhost-user master to fetch the contents of the virtio
> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> +  submitted by the vhost-user master to fetch the contents of the virtio
>device configuration space, vhost-user slave's payload size MUST match
>master's request, vhost-user slave uses zero length of payload to
>indicate an error to vhost-user master. The vhost-user master may
> @@ -677,7 +679,8 @@ Master message types
>Master payload: virtio device config space
>Slave payload: N/A
> 
> -  Submitted by the vhost-user master when the Guest changes the virtio
> +  When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
> +  submitted by the vhost-user master when the Guest changes the virtio
>device configuration space and also can be used for live migration
>on the destination host. The vhost-user slave must check the flags
>field, and slaves MUST NOT accept SET_CONFIG for read-only
> @@ -766,13 +769,13 @@ Slave message types
>   Slave payload: N/A
>   Master payload: N/A
> 
> - Vhost-user slave sends such messages to notify that the virtio device's
> - configuration space has changed, for those host devices which can 
> support
> - such feature, host driver can send VHOST_USER_GET_CONFIG message to
> slave
> - to get the latest content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is
> - negotiated, and slave set the VHOST_USER_NEED_REPLY flag, master must
> - respond with zero when operation is successfully completed, or non-zero
> - otherwise.
> + When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, vhost-user slave
> sends
> + such messages to notify that the virtio device's configuration space has
> + changed, for those host devices which can support such feature, host
> + driver can send VHOST_USER_GET_CONFIG message to slave to get the latest
> + content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, and slave
> set
> + the VHOST_USER_NEED_REPLY flag, master must respond with zero when
> + operation is successfully completed, or non-zero otherwise.
> 
>  VHOST_USER_PROTOCOL_F_REPLY_ACK:
>  ---
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 44aea5c0a8..cc8a24aa31 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -46,6 +46,7 @@ enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
>  VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
>  VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> +VHOST_USER_PROTOCOL_F_CONFIG = 9,
>  VHOST_USER_PROTOCOL_F_MAX
>  };
> 
> @@ -1211,6 +1212,17 @@ static int vhost_user_init

Re: [Qemu-devel] [PATCH v2 1/2] vhost-user-blk: set config ops before vhost-user init

2018-03-28 Thread Liu, Changpeng


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 29, 2018 3:28 AM
> To: m...@redhat.com; Liu, Changpeng <changpeng@intel.com>;
> marcandre.lur...@redhat.com; qemu-devel@nongnu.org
> Cc: Maxime Coquelin <maxime.coque...@redhat.com>
> Subject: [PATCH v2 1/2] vhost-user-blk: set config ops before vhost-user init
> 
> As soon as vhost-user init is done, the backend may send
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, so let's set the
> notification callback before it.
> 
> Also, it will be used to know whether the device supports
> the config feature to advertize it or not.
> 
> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
> ---
>  hw/block/vhost-user-blk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index f840f07dfe..262baca432 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -259,6 +259,8 @@ static void vhost_user_blk_device_realize(DeviceState
> *dev, Error **errp)
>  s->dev.vq_index = 0;
>  s->dev.backend_features = 0;
> 
> +vhost_dev_set_config_notifier(>dev, _ops);
Please also remove the line "assert(hdev->vhost_ops);" in function 
vhost_dev_set_config_notifier at vhost.c file.
> +
>  ret = vhost_dev_init(>dev, >chardev, VHOST_BACKEND_TYPE_USER, 0);
>  if (ret < 0) {
>  error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> @@ -277,8 +279,6 @@ static void vhost_user_blk_device_realize(DeviceState
> *dev, Error **errp)
>  s->blkcfg.num_queues = s->num_queues;
>  }
> 
> -vhost_dev_set_config_notifier(>dev, _ops);
> -
>  return;
> 
>  vhost_err:
> --
> 2.14.3




Re: [Qemu-devel] virtio-blk discard support

2018-02-05 Thread Liu, Changpeng
Hi Cathy,

Actually the code development from my side has been done for a while, include
DISCARD/WRITE ZEROES/WRITE ZEROES with Discard Bit, currently
I'm writing the specification related changes, I'll send the specification
change for review firstly.

> -Original Message-
> From: Cathy Avery [mailto:cav...@redhat.com]
> Sent: Monday, February 5, 2018 10:20 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: Stefan Hajnoczi <stefa...@redhat.com>; Paolo Bonzini
> <pbonz...@redhat.com>; qemu-devel@nongnu.org
> Subject: virtio-blk discard support
> 
> Hi,
> 
> I am looking at extending virtio-blk by adding support for the discard
> request.
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg03889.html
> 
> My understanding is that you have already begun this work with [PATCH]
> virtio-blk: add DISCARD support to virtio-blk drive
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05146.html
> 
> Are you intending to finish the implementation? I don't want to start a
> development effort if you are already working on it.
> 
> Thanks,
> 
> Cathy Avery
> 
> 
> 
> 
> 



Re: [Qemu-devel] [RFC v1] block/NVMe: introduce a new vhost NVMe host device to QEMU

2018-01-29 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, January 29, 2018 11:29 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; Harris, James R <james.r.har...@intel.com>; Busch,
> Keith <keith.bu...@intel.com>; f...@redhat.com; pbonz...@redhat.com;
> m...@redhat.com
> Subject: Re: [RFC v1] block/NVMe: introduce a new vhost NVMe host device to
> QEMU
> 
> On Mon, Jan 15, 2018 at 04:01:55PM +0800, Changpeng Liu wrote:
> > NVMe 1.3 specification introduces a new NVMe ADMIN command:
> > doorbell buffer config, which can write shadow doorbell buffer
> > instead of MMIO registers, so it can improve the Guest performance
> > a lot for emulated NVMe devices inside VM.
> 
> If I understand correctly the Shadow Doorbell Buffer offers two
> optimizations:
> 
> 1. The guest driver only writes to the MMIO register when EventIdx has
>been reached.  This eliminates some MMIO writes.
Correct.
> 
> 2. The device may poll the Shadow Doorbell Buffer so that command
>processing can begin before guest driver performs an MMIO write.
> 
> Is this correct?
Guest should write shadow doorbell memory every time(each new request from 
Guest),
Guest may write PCI registers or not, depends on slave target's feedback.
Slave target can poll the shadow doorbell for new requests.
This can eliminate the MMIO writes for submission data path.
> 
> > Similar with existing vhost-user-scsi solution, this commit builds a
> > new vhost_user_nvme host device to VM and the I/O is processed at
> > the slave I/O target, so users can implement a user space NVMe driver
> > in the slave I/O target.
> >
> > Users can start QEMU with: -chardev socket,id=char0,path=/path/vhost.0 \
> > -device vhost-user-nvme,chardev=char0,num_io_queues=2.
> 
> Each new feature has a cost in terms of maintainance, testing,
> documentation, and support.  Users need to be educated about the role of
> each available storage controller and how to choose between them.
> 
> I'm not sure why QEMU should go in this direction since it makes the
> landscape more complex and harder to support.  You've said the
> performance is comparable to vhost-user-blk.  So what does NVMe offer
> that makes this worthwhile?
Good question, from the test results, this solution is almost same with
vhost-blk, this is still an ongoing work, I don't have a *MUST*
justification now.
> 
> A cool NVMe feature would be the ability to pass through invididual
> queues to different guests without SR-IOV.  In other words, bind a queue
> to namespace subset so that multiple guests can be isolated from each
> other.  That way the data path would not require vmexits.  The control
> path and device initialization would still be emulated by QEMU so the
> hardware does not need to provide the full resources and state needed
> for SR-IOV.  I looked into this but came to the conclusion that it would
> require changes to the NVMe specification because the namespace is a
> per-command field.



Re: [Qemu-devel] [RFC v1] block/NVMe: introduce a new vhost NVMe host device to QEMU

2018-01-16 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Wednesday, January 17, 2018 1:07 AM
> To: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org
> Cc: Harris, James R <james.r.har...@intel.com>; Busch, Keith
> <keith.bu...@intel.com>; f...@redhat.com; stefa...@gmail.com;
> m...@redhat.com
> Subject: Re: [RFC v1] block/NVMe: introduce a new vhost NVMe host device to
> QEMU
> 
> On 15/01/2018 09:01, Changpeng Liu wrote:
> > NVMe 1.3 specification introduces a new NVMe ADMIN command:
> > doorbell buffer config, which can write shadow doorbell buffer
> > instead of MMIO registers, so it can improve the Guest performance
> > a lot for emulated NVMe devices inside VM.
> >
> > Similar with existing vhost-user-scsi solution, this commit builds a
> > new vhost_user_nvme host device to VM and the I/O is processed at
> > the slave I/O target, so users can implement a user space NVMe driver
> > in the slave I/O target.
> >
> > Users can start QEMU with: -chardev socket,id=char0,path=/path/vhost.0 \
> > -device vhost-user-nvme,chardev=char0,num_io_queues=2.
> 
> Hi Changpeng,
> 
> I have two comments on this series.
> 
> First, the new command in NVMe 1.3 is great.  However, please first add
> support for the doorbell buffer config in hw/block/nvme.c.  There is no
> need to tie support for the new command to a completely new external
> server architecture.  Emulated NVMe can be enhanced to use iothreads and
> (when the doorbell buffer is configured) ioeventfd, and that should come
> before enhancements for external vhost-like servers.
Yes, adding doorbell buffer config support at hw/block/nvme.c is great to 
improve
the efficiency, this vhost-like idea is another solution which can provide 
end-to-end
userspace software stack.
> 
> Second, virtio-based vhost-user remains QEMU's preferred method for
> high-performance I/O in guests.  Discard support is missing and that is
> important for SSDs; that should be fixed in the virtio spec.  Are there
Previously I have a patch adding DISCARD support to virtio-blk, but I didn't 
find
the way using svn to update the spec patch, any git repository I can use
to update the virtio-blk spec ? I think I can pick up the feature again.
> any other features where virtio-blk is lagging behind NVMe?
From the efficiency consideration, I compared the solution with vhost-user-blk,
Indeed, the two solution are almost at the same level, IOPS/CPU utilization 
inside Guest/KVM events(VM_EXIT,
KVM_FAST_MMIO,KVM_MSI_SET_IRQ,KVM_MSR) are almost same.
> 
> Thanks,
> 
> Paolo


Re: [Qemu-devel] [PATCH v8 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2018-01-02 Thread Liu, Changpeng


> -Original Message-
> From: Liu, Changpeng
> Sent: Wednesday, January 3, 2018 10:07 AM
> To: 'Marc-André Lureau' <marcandre.lur...@gmail.com>
> Cc: QEMU <qemu-devel@nongnu.org>; Harris, James R <james.r.har...@intel.com>;
> Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>;
> Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi <fel...@nutanix.com>
> Subject: RE: [Qemu-devel] [PATCH v8 4/4] contrib/vhost-user-blk: introduce a
> vhost-user-blk sample application
> 
> 
> 
> > -Original Message-
> > From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> > Sent: Tuesday, January 2, 2018 11:44 PM
> > To: Liu, Changpeng <changpeng@intel.com>
> > Cc: QEMU <qemu-devel@nongnu.org>; Harris, James R
> <james.r.har...@intel.com>;
> > Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>;
> > Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi <fel...@nutanix.com>
> > Subject: Re: [Qemu-devel] [PATCH v8 4/4] contrib/vhost-user-blk: introduce a
> > vhost-user-blk sample application
> >
> > On Tue, Jan 2, 2018 at 4:55 AM, Changpeng Liu <changpeng@intel.com>
> wrote:
> > > This commit introcudes a vhost-user-blk backend device, it uses UNIX'
> >
> > introcudes -> introduces
> Thanks.
> >
> > > domain socket to communicate with QEMU. The vhost-user-blk sample
> > > application should be used with QEMU vhost-user-blk-pci device.
> > >
> > > To use it, complie with:
> > > make vhost-user-blk
> > >
> > > and start like this:
> > > vhost-user-blk -b /dev/sdb -s /path/vhost.socket
> > >
> > > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > > ---
> > >  .gitignore  |   1 +
> > >  Makefile|   3 +
> > >  Makefile.objs   |   1 +
> > >  contrib/vhost-user-blk/Makefile.objs|   1 +
> > >  contrib/vhost-user-blk/vhost-user-blk.c | 543
> > 
> > >  5 files changed, 549 insertions(+)
> > >  create mode 100644 contrib/vhost-user-blk/Makefile.objs
> > >  create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
> > >
> > > diff --git a/.gitignore b/.gitignore
> > > index 433f64f..704b222 100644
> > > --- a/.gitignore
> > > +++ b/.gitignore
> > > @@ -54,6 +54,7 @@
> > >  /module_block.h
> > >  /scsi/qemu-pr-helper
> > >  /vhost-user-scsi
> > > +/vhost-user-blk
> > >  /fsdev/virtfs-proxy-helper
> > >  *.tmp
> > >  *.[1-9]
> > > diff --git a/Makefile b/Makefile
> > > index d86ecd2..f021fc8 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -331,6 +331,7 @@ dummy := $(call unnest-vars,, \
> > >  ivshmem-server-obj-y \
> > >  libvhost-user-obj-y \
> > >  vhost-user-scsi-obj-y \
> > > +vhost-user-blk-obj-y \
> > >  qga-vss-dll-obj-y \
> > >  block-obj-y \
> > >  block-obj-m \
> > > @@ -562,6 +563,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y)
> > $(COMMON_LDADDS)
> > >  endif
> > >  vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
> > > $(call LINK, $^)
> > > +vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
> > > +   $(call LINK, $^)
> > >
> > >  module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-
> > host.mak
> > > $(call quiet-command,$(PYTHON) $< $@ \
> > > diff --git a/Makefile.objs b/Makefile.objs
> > > index 285c6f3..ae9aef7 100644
> > > --- a/Makefile.objs
> > > +++ b/Makefile.objs
> > > @@ -115,6 +115,7 @@ libvhost-user-obj-y = contrib/libvhost-user/
> > >  vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
> > >  vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
> > >  vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
> > > +vhost-user-blk-obj-y = contrib/vhost-user-blk/
> > >
> > >
> >
> ####
> > ##
> > >  trace-events-subdirs =
> > > diff --git a/contrib/vhost-user-blk/Makefile.objs b/contrib/vhost-user-
> > blk/Makefile.objs
> > > new file mode 100644
> > > index 000..72e2cdc
> > > --- /dev/

Re: [Qemu-devel] [PATCH v8 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2018-01-02 Thread Liu, Changpeng


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> Sent: Tuesday, January 2, 2018 11:44 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: QEMU <qemu-devel@nongnu.org>; Harris, James R <james.r.har...@intel.com>;
> Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>;
> Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi <fel...@nutanix.com>
> Subject: Re: [Qemu-devel] [PATCH v8 4/4] contrib/vhost-user-blk: introduce a
> vhost-user-blk sample application
> 
> On Tue, Jan 2, 2018 at 4:55 AM, Changpeng Liu <changpeng@intel.com> wrote:
> > This commit introcudes a vhost-user-blk backend device, it uses UNIX'
> 
> introcudes -> introduces
Thanks.
> 
> > domain socket to communicate with QEMU. The vhost-user-blk sample
> > application should be used with QEMU vhost-user-blk-pci device.
> >
> > To use it, complie with:
> > make vhost-user-blk
> >
> > and start like this:
> > vhost-user-blk -b /dev/sdb -s /path/vhost.socket
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  .gitignore  |   1 +
> >  Makefile|   3 +
> >  Makefile.objs   |   1 +
> >  contrib/vhost-user-blk/Makefile.objs|   1 +
> >  contrib/vhost-user-blk/vhost-user-blk.c | 543
> 
> >  5 files changed, 549 insertions(+)
> >  create mode 100644 contrib/vhost-user-blk/Makefile.objs
> >  create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
> >
> > diff --git a/.gitignore b/.gitignore
> > index 433f64f..704b222 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -54,6 +54,7 @@
> >  /module_block.h
> >  /scsi/qemu-pr-helper
> >  /vhost-user-scsi
> > +/vhost-user-blk
> >  /fsdev/virtfs-proxy-helper
> >  *.tmp
> >  *.[1-9]
> > diff --git a/Makefile b/Makefile
> > index d86ecd2..f021fc8 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -331,6 +331,7 @@ dummy := $(call unnest-vars,, \
> >  ivshmem-server-obj-y \
> >  libvhost-user-obj-y \
> >  vhost-user-scsi-obj-y \
> > +vhost-user-blk-obj-y \
> >  qga-vss-dll-obj-y \
> >  block-obj-y \
> >  block-obj-m \
> > @@ -562,6 +563,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y)
> $(COMMON_LDADDS)
> >  endif
> >  vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
> > $(call LINK, $^)
> > +vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
> > +   $(call LINK, $^)
> >
> >  module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-
> host.mak
> > $(call quiet-command,$(PYTHON) $< $@ \
> > diff --git a/Makefile.objs b/Makefile.objs
> > index 285c6f3..ae9aef7 100644
> > --- a/Makefile.objs
> > +++ b/Makefile.objs
> > @@ -115,6 +115,7 @@ libvhost-user-obj-y = contrib/libvhost-user/
> >  vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
> >  vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
> >  vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
> > +vhost-user-blk-obj-y = contrib/vhost-user-blk/
> >
> >
> 
> ##
> >  trace-events-subdirs =
> > diff --git a/contrib/vhost-user-blk/Makefile.objs b/contrib/vhost-user-
> blk/Makefile.objs
> > new file mode 100644
> > index 000..72e2cdc
> > --- /dev/null
> > +++ b/contrib/vhost-user-blk/Makefile.objs
> > @@ -0,0 +1 @@
> > +vhost-user-blk-obj-y = vhost-user-blk.o
> > diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-
> blk/vhost-user-blk.c
> > new file mode 100644
> > index 000..4649435
> > --- /dev/null
> > +++ b/contrib/vhost-user-blk/vhost-user-blk.c
> > @@ -0,0 +1,543 @@
> > +/*
> > + * vhost-user-blk sample application
> > + *
> > + * Copyright (c) 2017 Intel Corporation. All rights reserved.
> > + *
> > + * Author:
> > + *  Changpeng Liu <changpeng@intel.com>
> > + *
> > + * This work is based on the "vhost-user-scsi" sample and "virtio-blk" 
> > driver
> > + * implemention by:
> 
> implemention -> implementation
Thanks.
> 
> > + *  Felipe Franciosi <fel...@nutanix.com>
> > + *  Anthony Liguori <aligu...@us.ibm.com>
> > + *
> > + * This work is

Re: [Qemu-devel] [PATCH v8 3/4] contrib/libvhost-user: enable virtio config space messages

2018-01-02 Thread Liu, Changpeng


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> Sent: Tuesday, January 2, 2018 11:30 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: QEMU <qemu-devel@nongnu.org>; Harris, James R <james.r.har...@intel.com>;
> Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>;
> Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi <fel...@nutanix.com>
> Subject: Re: [Qemu-devel] [PATCH v8 3/4] contrib/libvhost-user: enable virtio
> config space messages
> 
> Hi
> 
> On Tue, Jan 2, 2018 at 4:55 AM, Changpeng Liu <changpeng@intel.com> wrote:
> > Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages in
> > libvhost-user library, users can implement their own I/O target
> > based on the library. This enable the virtio config space delivered
> > between QEMU host device and the I/O target.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  contrib/libvhost-user/libvhost-user.c | 54
> +++
> >  contrib/libvhost-user/libvhost-user.h | 34 +-
> >  2 files changed, 87 insertions(+), 1 deletion(-)
> >
> > diff --git a/contrib/libvhost-user/libvhost-user.c 
> > b/contrib/libvhost-user/libvhost-
> user.c
> > index f409bd3..9e12eb1 100644
> > --- a/contrib/libvhost-user/libvhost-user.c
> > +++ b/contrib/libvhost-user/libvhost-user.c
> > @@ -84,6 +84,8 @@ vu_request_to_string(unsigned int req)
> >  REQ(VHOST_USER_SET_SLAVE_REQ_FD),
> >  REQ(VHOST_USER_IOTLB_MSG),
> >  REQ(VHOST_USER_SET_VRING_ENDIAN),
> > +REQ(VHOST_USER_GET_CONFIG),
> > +REQ(VHOST_USER_SET_CONFIG),
> >  REQ(VHOST_USER_MAX),
> >  };
> >  #undef REQ
> > @@ -798,6 +800,53 @@ vu_set_slave_req_fd(VuDev *dev, VhostUserMsg *vmsg)
> >  }
> >
> >  static bool
> > +vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
> > +{
> > +int ret = -1;
> > +
> > +if (dev->iface->get_config) {
> > +ret = dev->iface->get_config(dev, vmsg->payload.config.region,
> > + vmsg->payload.config.size);
> > +}
> > +
> > +if (ret) {
> > +/* resize to zero to indicate an error to master */
> > +vmsg->size = 0;
> > +}
> > +
> > +return true;
> > +}
> > +
> > +static bool
> > +vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
> > +{
> > +int ret = -1;
> > +bool reply_supported = !!(dev->protocol_features &
> > + (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK));
> > +
> 
> None of the other messages in libvhost-user support the REPLY_ACK
> flag. Do you need it? If not, please drop it.
Ok, will drop it.
> 
> If you need it, it's a better idea to make the reply explicit and
> mandatory in the protocol for SET_CONFIG.
> 
> > +if (dev->iface->set_config) {
> > +ret = dev->iface->set_config(dev, vmsg->payload.config.region,
> > + vmsg->payload.config.offset,
> > + vmsg->payload.config.size,
> > + vmsg->payload.config.flags);
> > +}
> > +
> > +vmsg->size = sizeof(vmsg->payload.u64);
> > +if (!ret) {
> > +vmsg->payload.u64 = 0;
> > +} else {
> > +/* indicate an error in case reply supported */
> > +vmsg->payload.u64 = 1;
> > +}
> > +
> > +if (reply_supported) {
> > +return true;
> > +}
> > +
> > +return false;
> > +}
> > +
> > +static bool
> >  vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> >  {
> >  int do_reply = 0;
> > @@ -862,6 +911,10 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> >  return vu_set_vring_enable_exec(dev, vmsg);
> >  case VHOST_USER_SET_SLAVE_REQ_FD:
> >  return vu_set_slave_req_fd(dev, vmsg);
> > +case VHOST_USER_GET_CONFIG:
> > +return vu_get_config(dev, vmsg);
> > +case VHOST_USER_SET_CONFIG:
> > +return vu_set_config(dev, vmsg);
> >  case VHOST_USER_NONE:
> >  break;
> >  default:
> > @@ -970,6 +1023,7 @@ vu_init(VuDev *dev,
> >  dev->iface = iface;
> >  dev->log_call_fd = -1;
> >  dev->slave_fd = -1;
> > +
> 
> unrelated coding style chan

Re: [Qemu-devel] [PATCH v8 1/4] vhost-user: add new vhost user messages to support virtio config space

2018-01-02 Thread Liu, Changpeng


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> Sent: Tuesday, January 2, 2018 11:20 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: QEMU <qemu-devel@nongnu.org>; Harris, James R <james.r.har...@intel.com>;
> Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>;
> Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi <fel...@nutanix.com>
> Subject: Re: [Qemu-devel] [PATCH v8 1/4] vhost-user: add new vhost user 
> messages
> to support virtio config space
> 
>  Hi
> 
> On Tue, Jan 2, 2018 at 4:55 AM, Changpeng Liu <changpeng@intel.com> wrote:
> > Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can
> be
> > used for live migration of vhost user devices, also vhost user devices
> > can benefit from the messages to get/set virtio config space from/to the
> > I/O target. For the purpose to support virtio config space change,
> > VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event
> notifier
> > in case virtio config space change in the slave I/O target.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  docs/interop/vhost-user.txt   |  53 +
> >  hw/virtio/vhost-user.c| 118 
> > ++
> >  hw/virtio/vhost.c |  32 +++
> >  include/hw/virtio/vhost-backend.h |  12 
> >  include/hw/virtio/vhost.h |  15 +
> >  5 files changed, 230 insertions(+)
> >
> > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > index 954771d..1788fff 100644
> > --- a/docs/interop/vhost-user.txt
> > +++ b/docs/interop/vhost-user.txt
> > @@ -116,6 +116,19 @@ Depending on the request type, payload can be:
> >  - 3: IOTLB invalidate
> >  - 4: IOTLB access fail
> >
> > + * Virtio device config space
> > +   ---
> > +   | offset | size | flags | payload |
> > +   ---
> > +
> > +   Offset: a 32-bit offset of virtio device's configuration space
> > +   Size: a 32-bit configuration space access size in bytes
> > +   Flags: a 32-bit value:
> > +- 0: Vhost master messages used for writeable fields
> > +- 1: Vhost master messages used for live migration
> > +   Payload: Size bytes array holding the contents of the virtio
> > +   device's configuration space
> > +
> >  In QEMU the vhost-user message is implemented with the following struct:
> >
> >  typedef struct VhostUserMsg {
> > @@ -129,6 +142,7 @@ typedef struct VhostUserMsg {
> >  VhostUserMemory memory;
> >  VhostUserLog log;
> >  struct vhost_iotlb_msg iotlb;
> > +VhostUserConfig config;
> >  };
> >  } QEMU_PACKED VhostUserMsg;
> >
> > @@ -596,6 +610,30 @@ Master message types
> >and expect this message once (per VQ) during device configuration
> >(ie. before the master starts the VQ).
> >
> > + * VHOST_USER_GET_CONFIG
> > +
> > +  Id: 24
> > +  Equivalent ioctl: N/A
> 
> Please document the Master payload. (msg.size != 0)
Ok, will add.
> 
> 
> > +  Slave payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master to fetch the contents of the 
> > virtio
> > +  device configuration space, vhost-user slave's payload size MUST 
> > match
> > +  master's request, vhost-user slave uses zero length of payload to
> > +  indicate an error to vhost-user master. The vhost-user master may
> > +  cache the contents to avoid repeated VHOST_USER_GET_CONFIG calls.
> > +
> > +* VHOST_USER_SET_CONFIG
> > +
> > +  Id: 25
> > +  Equivalent ioctl: N/A
> 
> Same here
Ok.
> 
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master when the Guest changes the virtio
> > +  device configuration space and also can be used for live migration
> > +  on the destination host. The vhost-user slave must check the flags
> > +  field, and slaves MUST NOT accept SET_CONFIG for read-only
> > +  configuration space fields unless the live migration bit is set.
> > +
> >  Slave message types
> >  ---
> >
> > @@ -614,6 +652,21 @@ Slave message types
> >This request should be send only when VIRTIO_F_IOMMU_PLATFORM
> feature
> >has been successfully 

Re: [Qemu-devel] [PATCH v7 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-12-21 Thread Liu, Changpeng


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> Sent: Thursday, December 21, 2017 6:48 PM
> To: Michael S. Tsirkin <m...@redhat.com>
> Cc: Liu, Changpeng <changpeng@intel.com>; QEMU  de...@nongnu.org>; Harris, James R <james.r.har...@intel.com>; Stefan Hajnoczi
> <stefa...@gmail.com>; Paolo Bonzini <pbonz...@redhat.com>; Felipe Franciosi
> <fel...@nutanix.com>
> Subject: Re: [Qemu-devel] [PATCH v7 1/4] vhost-user: add new vhost user 
> messages
> to support virtio config space
> 
> Hi
> 
> On Thu, Dec 21, 2017 at 1:21 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Wed, Dec 20, 2017 at 04:47:13PM +0100, Marc-André Lureau wrote:
> >> Hi
> >>
> >> On Wed, Dec 13, 2017 at 3:29 AM, Changpeng Liu <changpeng@intel.com>
> wrote:
> >> > Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which
> can be
> >> > used for live migration of vhost user devices, also vhost user devices
> >> > can benefit from the messages to get/set virtio config space from/to the
> >> > I/O target. For the purpose to support virtio config space change,
> >> > VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> >> > in case virtio config space change in the I/O target.
> >>
> >> Why not use a new slave message instead of adding another fd to watch for?
> >>
> >> VHOST_USER_SLAVE_SET_CONFIG: notify the master of device configuration
> >> space change
> >
> >
> > Well that's a nice idea, but at v7 I'd expect it we are past such
> > fundamental suggestions.
> 
> I am sorry, I was quite busy with other work. (that happen to all of
> us). Hopefully, we still have some time for development for this
> release. Eventually, we could make this improvement on top during this
> cycle, but I would rather have protocol changes agreed before they hit
> master.
I can replace the event file descriptor for configuration space with a new 
slave message,
since the patch set was first developed based on QEMU 2.8, and the vhost slave 
channel
didn't support at that time, currently add a new slave message also make sense.
I will send v8 to cover this change and the comments you mentioned. 
Is everyone agreed that use a slave channel message about virtio configuration 
changes?
> 
> >
> > In particular Stefan wanted to require that slave is non-blocking,
> > and it's quite hard to support for the existing channel.
> 
> The notification could be non blocking, but the messages will go on
> the same channel.
> 
> The slave channel is actually not so busy and will require a single
> message (instead of notify + master get with current proposal)
> 
> Adding a slave channel message should also be easier than adding
> another event fd (both in master and slave).
> 
> >
> > It's up to the contributor whether to make this change, I'm fine
> > either way.
> >
> > Concerns such as overflow possibility below need to be addressed.
> 
> thanks
> 
> >
> >>
> >> >
> >> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> >> > ---
> >> >  docs/interop/vhost-user.txt   |  45 
> >> >  hw/virtio/vhost-user.c| 107
> ++
> >> >  hw/virtio/vhost.c |  64 +++
> >> >  include/hw/virtio/vhost-backend.h |  14 +
> >> >  include/hw/virtio/vhost.h |  16 ++
> >> >  5 files changed, 246 insertions(+)
> >> >
> >> > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> >> > index 954771d..826ba18 100644
> >> > --- a/docs/interop/vhost-user.txt
> >> > +++ b/docs/interop/vhost-user.txt
> >> > @@ -116,6 +116,19 @@ Depending on the request type, payload can be:
> >> >  - 3: IOTLB invalidate
> >> >  - 4: IOTLB access fail
> >> >
> >> > + * Virtio device config space
> >> > +   ---
> >> > +   | offset | size | flags | payload |
> >> > +   ---
> >> > +
> >> > +   Offset: a 32-bit offset of virtio device's configuration space
> >> > +   Size: a 32-bit configuration space access size in bytes
> >> > +   Flags: a 32-bit value:
> >> > +- 0: Vhost master messages used for writeable fields
> >> > +- 1: Vhost master messages used for live migration
> >> > +   Payl

Re: [Qemu-devel] [PATCH 2/2] virtio: fix IO request length in virtio SCSI/block

2017-12-19 Thread Liu, Changpeng


> -Original Message-
> From: Roman Kagan [mailto:rka...@virtuozzo.com]
> Sent: Tuesday, December 19, 2017 4:58 PM
> To: Denis V. Lunev <d...@openvz.org>
> Cc: Felipe Franciosi <fel...@nutanix.com>; Harris, James R
> <james.r.har...@intel.com>; Stefan Hajnoczi <stefa...@redhat.com>; Kevin Wolf
> <kw...@redhat.com>; Eduardo Habkost <ehabk...@redhat.com>; Michael S.
> Tsirkin <m...@redhat.com>; qemu-devel@nongnu.org; Max Reitz
> <mre...@redhat.com>; Paolo Bonzini <pbonz...@redhat.com>; Liu, Changpeng
> <changpeng@intel.com>; Richard Henderson <r...@twiddle.net>
> Subject: Re: [Qemu-devel] [PATCH 2/2] virtio: fix IO request length in virtio
> SCSI/block
> 
> On Mon, Dec 18, 2017 at 10:42:35PM +0300, Denis V. Lunev wrote:
> > On 12/18/2017 10:35 PM, Felipe Franciosi wrote:
> > >> On 18 Dec 2017, at 16:16, Harris, James R <james.r.har...@intel.com> 
> > >> wrote:
> > >>> On Dec 18, 2017, at 6:38 AM, Stefan Hajnoczi <stefa...@redhat.com> 
> > >>> wrote:
> > >>> On Fri, Dec 15, 2017 at 06:02:50PM +0300, Denis V. Lunev wrote:
> > >>>> diff --git a/include/hw/compat.h b/include/hw/compat.h
> > >>>> index 026fee9..b9be5d7 100644
> > >>>> --- a/include/hw/compat.h
> > >>>> +++ b/include/hw/compat.h
> > >>>> @@ -2,6 +2,23 @@
> > >>>> #define HW_COMPAT_H
> > >>>>
> > >>>> #define HW_COMPAT_2_11 \
> > >>>> +{\
> > >>>> +.driver   = "virtio-blk-device",\
> > >>>> +.property = "max_segments",\
> > >>>> +.value= "126",\
> > >>>> +},{\
> > >>>> +.driver   = "vhost-scsi",\
> > >>>> +.property = "max_segments",\
> > >>>> +.value= "126",\
> > >>>> +},{\
> > >>>> +.driver   = "vhost-user-scsi",\
> > >>>> +.property = "max_segments",\
> > >>>> +.value= "126",\
> > >>> Existing vhost-user-scsi slave programs might not expect up to 1022
> > >>> segments.  Hopefully we can get away with this change since there are
> > >>> relatively few vhost-user-scsi slave programs.
> > >>>
> > >>> CCed Felipe (Nutanix) and Jim (SPDK) in case they have comments.
> > >> SPDK vhost-user targets only expect max 128 segments.  They also 
> > >> pre-allocate
> I/O task structures when QEMU connects to the vhost-user device.
> > >>
> > >> Supporting up to 1022 segments would result in significantly higher 
> > >> memory
> usage, reduction in I/O queue depth processed by the vhost-user target, or 
> having
> to dynamically allocate I/O task structures - none of which are ideal.
> > >>
> > >> What if this was just bumped from 126 to 128?  I guess I’m trying to
> understand the level of guest and host I/O performance that is gained with 
> this
> patch.  One I/O per 512KB vs. one I/O per 4MB - we are still only talking 
> about a
> few hundred IO/s difference.
> > > SeaBIOS also makes the assumption that the queue size is not bigger than 
> > > 128
> elements.
> > > https://review.coreboot.org/cgit/seabios.git/tree/src/hw/virtio-ring.h#n23
> > >
> > > Perhaps a better approach is to make the value configurable (ie. add the
> "max_segments" property), but set the default to 128-2. In addition to what 
> Jim
> pointed out, I think there may be other legacy front end drivers which can 
> assume
> the ring will be at most 128 entries in size.
> > >
> > > With that, hypervisors can choose to bump the value higher if it's known 
> > > to be
> safe for their host+guest configuration.
> >
> > This should not be a problem at all IMHO. The guest is not obliged
> > to use the message of entire possible size. The guest initiates
> > request with 128 elements. Fine. QEMU is ready to this.
> 
> QEMU is, but vhost-user slaves may not be.  And there seems to be no
> vhost-user protocol message type that would allow to negotiate this
> value between the master and the slave.
> 
> So apparently the default for vhost-user-scsi has to stay the same in
> order not to break existing slaves.  I guess having it tunable via a
> property may still turn out useful.
Actually I wrote a new patch set recently for support vhost-user-blk host 
device, 
and added 2 extra vhost-user messages,  GET_CONFIG/SET_CONFIG which can let 
host device get those parameters 
from vhost-user slave target, the new added messages can get virtio device's 
configuration space from slave target,
so vhost-user-scsi may use that as well. 

> 
> Roman.


Re: [Qemu-devel] [PATCH v6 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2017-12-11 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 10:33 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v6 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk
> sample application
> 
> > +static int vub_virtio_process_req(VubDev *vdev_blk,
> > + VuVirtq *vq)
> > +{
> > +VugDev *gdev = _blk->parent;
> > +VuDev *vu_dev = >parent;
> > +VuVirtqElement *elem;
> > +uint32_t type;
> > +unsigned in_num;
> > +unsigned out_num;
> > +VubReq *req;
> > +
> > +elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement));
> > +if (!elem) {
> > +return -1;
> > +}
> > +
> > +/* refer to hw/block/virtio_blk.c */
> > +if (elem->out_num < 1 || elem->in_num < 1) {
> > +fprintf(stderr, "virtio-blk request missing headers\n");
> > +free(elem);
> > +return -1;
> > +}
> > +
> > +req = g_new0(VubReq, 1);
> > +req->vdev_blk = vdev_blk;
> > +req->vq = vq;
> > +req->elem = elem;
> > +
> > +in_num = elem->in_num;
> > +out_num = elem->out_num;
> > +
> > +/* don't support VIRTIO_F_ANY_LAYOUT and virtio 1.0 only */
> > +if (elem->out_sg[0].iov_len < sizeof(struct virtio_blk_outhdr)) {
> > +fprintf(stderr, "Invalid outhdr size\n");
> > +goto err;
> > +}
> 
> QEMU has iov_discard_front() and iov_discard_back().  They make it
> pretty easy to support VIRTIO_F_ANY_LAYOUT.  If you have time, please
> consider adding it to libvhost-user, but it's not a requirement for this
> patch series.
> 
> > +req->out = (struct virtio_blk_outhdr *)elem->out_sg[0].iov_base;
> > +out_num--;
> > +
> > +if (elem->in_sg[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) 
> > {
> > +fprintf(stderr, "Invalid inhdr size\n");
> > +goto err;
> > +}
> > +req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
> > +in_num--;
> > +
> > +type = le32toh(req->out->type);
> 
> Endianness is more complicated if you want to support both VIRTIO 1.0
> and Legacy (0.9.7).  I guess it's okay to make libvhost-user code only
> support VIRTIO 1.0.
> > +static void vub_queue_set_started(VuDev *vu_dev, int idx, bool started)
> > +{
> > +VuVirtq *vq;
> > +
> > +assert(vu_dev);
> > +
> > +if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
> > +fprintf(stderr, "VQ Index out of range: %d\n", idx);
> > +vub_panic_cb(vu_dev, NULL);
> > +return;
> > +}
> 
> Is it necessary to check num_queues?  The vhost-user master should not
> be able to enable idx >= num_queues.
Yes, this part of code can be removed.



Re: [Qemu-devel] [PATCH v6 3/4] contrib/libvhost-user: enable virtio config space messages

2017-12-11 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 10:00 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v6 3/4] contrib/libvhost-user: enable virtio config space
> messages
> 
> On Tue, Dec 05, 2017 at 02:27:18PM +0800, Changpeng Liu wrote:
> > @@ -798,6 +801,70 @@ vu_set_slave_req_fd(VuDev *dev, VhostUserMsg
> *vmsg)
> >  }
> >
> >  static bool
> > +vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
> > +{
> > +int ret = -1;
> > +
> > +if (dev->iface->get_config) {
> > +ret = dev->iface->get_config(dev, vmsg->payload.config.region,
> > + vmsg->payload.config.size);
> > +}
> > +
> > +if (ret) {
> > +/* resize to zero to indicate an error to master */
> > +vmsg->size = 0;
> > +}
> 
> Please document this error case in vhost-user.txt.  I don't remember
> reading about it.
Thanks, will add it to vhost-user.txt.



Re: [Qemu-devel] [PATCH v6 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-12-11 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 9:39 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v6 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Tue, Dec 05, 2017 at 02:27:16PM +0800, Changpeng Liu wrote:
> > +* VHOST_USER_SET_CONFIG
> > +  Id: 25
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master when the Guest changes the virtio
> > +  device configuration space and also can be used for live migration
> > +  on the destination host. The vhost-user slave must check the flags
> > +  filed, and slaves MUST NOT accept SET_CONFIG for read-only
> 
> s/filed/field/
Thanks.
> 
> > +static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t 
> > *config,
> > + uint32_t offset, uint32_t size, uint32_t 
> > flags)
> > +{
> > +uint8_t *p;
> > +bool reply_supported = virtio_has_feature(dev->protocol_features,
> > +  
> > VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > +
> > +VhostUserMsg msg = {
> > +msg.request = VHOST_USER_SET_CONFIG,
> > +msg.flags = VHOST_USER_VERSION,
> > +msg.size = VHOST_USER_CONFIG_HDR_SIZE + size,
> > +};
> > +
> > +if (reply_supported) {
> > +msg.flags |= VHOST_USER_NEED_REPLY_MASK;
> > +}
> > +
> > +msg.payload.config.offset = offset,
> > +msg.payload.config.size = size,
> > +msg.payload.config.flags = flags,
> > +p = msg.payload.config.region;
> > +memcpy(p, config + offset, size);
> 
> This function can be made more general by changing the semantics of the
> config argument:
> 
>   memcpy(p, config, size);
> 
> Now the caller can pass just a single field instead of a whole 256-byte
> config buffer.  It might be clearer to name the argument "data" or
> "region" instead of "config" though.
Good suggestion, will change it.
> 
> > @@ -1505,6 +1508,67 @@ void vhost_ack_features(struct vhost_dev *hdev,
> const int *feature_bits,
> >  }
> >  }
> >
> > +int vhost_dev_get_config(struct vhost_dev *hdev, uint8_t *config,
> > + uint32_t config_len)
> > +{
> > +assert(hdev->vhost_ops);
> > +
> > +if (hdev->vhost_ops->vhost_get_config) {
> > +return hdev->vhost_ops->vhost_get_config(hdev, config, config_len);
> > +}
> > +
> > +return 0;
> > +}
> > +
> > +int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *config,
> > + uint32_t offset, uint32_t size, uint32_t flags)
> > +{
> > +assert(hdev->vhost_ops);
> > +
> > +if (hdev->vhost_ops->vhost_set_config) {
> > +return hdev->vhost_ops->vhost_set_config(hdev, config, offset,
> > + size, flags);
> > +}
> > +
> > +return 0;
> > +}
> 
> Both vhost_dev_get_config() and vhost_dev_set_config() cannot fail
> silently.  The device will not work properly if the configuration space
> feature is not supported.
> 
> Please make these functions return an error if the callback is NULL.
Ok, will add callback check.
> 
> The vhost-blk code should also check that
> hdev->vhost_ops->vhost_set_config != NULL during realize.  This way
> users see the error when adding the device instead of at runtime when
> the function gets called.
For vhost-blk, get_config is mandatory, but for set_config, it should depend
on the feature bit: VIRTIO_BLK_F_CONFIG_WCE is enabled or not. Of course,
migration should be another case. So running time error process should be
okay.




Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-21 Thread Liu, Changpeng


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, November 21, 2017 4:45 AM
> To: Stefan Hajnoczi <stefa...@gmail.com>
> Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> pbonz...@redhat.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v5 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Mon, Nov 20, 2017 at 04:26:31PM +, Stefan Hajnoczi wrote:
> > On Fri, Nov 17, 2017 at 04:44:37AM +0800, Changpeng Liu wrote:
> > > Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which
> can be
> > > used for live migration of vhost user devices, also vhost user devices
> > > can benefit from the messages to get/set virtio config space from/to the
> > > I/O target. For the purpose to support virtio config space change,
> > > VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> > > in case virtio config space change in the I/O target.
> > >
> > > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > > ---
> > >  docs/interop/vhost-user.txt   | 39 
> > >  hw/virtio/vhost-user.c| 98
> +++
> > >  hw/virtio/vhost.c | 63 +
> > >  include/hw/virtio/vhost-backend.h |  8 
> > >  include/hw/virtio/vhost.h | 16 +++
> > >  5 files changed, 224 insertions(+)
> > >
> > > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > > index 954771d..1b98388 100644
> > > --- a/docs/interop/vhost-user.txt
> > > +++ b/docs/interop/vhost-user.txt
> > > @@ -116,6 +116,16 @@ Depending on the request type, payload can be:
> > >  - 3: IOTLB invalidate
> > >  - 4: IOTLB access fail
> > >
> > > + * Virtio device config space
> > > +   ---
> > > +   | offset | size | payload |
> > > +   ---
> > > +
> > > +   Offset: a 32-bit offset of virtio device's configuration space
> >
> > s/of/in the/
> >
> > > +   Size: a 32-bit size of configuration space that master wanted to 
> > > change
> >
> > Is this also used for GET_CONFIG?  If yes, I suggest "a 32-bit
> > configuration space access size in bytes".
> >
> > Please mention that Size must be <= 256 bytes.
> >
> > > +   Payload: a 256-bytes array holding the contents of the virtio
> > > +   device's configuration space
> >
> > What about bytes outside the [offset, offset+size) range?  I guess they
> > must be 0 and are ignored by the master/slave.
> >
> > Would it be cleaner to make Payload a variable-sized field with Size
> > bytes?  That way it's not necessary to transfer 0s and memcpy() a subset
> > of the payload array.
> >
> > > +
> > >  In QEMU the vhost-user message is implemented with the following struct:
> > >
> > >  typedef struct VhostUserMsg {
> > > @@ -129,6 +139,7 @@ typedef struct VhostUserMsg {
> > >  VhostUserMemory memory;
> > >  VhostUserLog log;
> > >  struct vhost_iotlb_msg iotlb;
> > > +VhostUserConfig config;
> > >  };
> > >  } QEMU_PACKED VhostUserMsg;
> > >
> > > @@ -596,6 +607,34 @@ Master message types
> > >and expect this message once (per VQ) during device configuration
> > >(ie. before the master starts the VQ).
> > >
> > > + * VHOST_USER_GET_CONFIG
> > > +  Id: 24
> > > +  Equivalent ioctl: N/A
> > > +  Master payload: virtio device config space
> > > +
> > > +  Submitted by the vhost-user master to fetch the contents of the 
> > > virtio
> > > +  device configuration space. The vhost-user master may cache the 
> > > contents
> > > +  to avoid repeated VHOST_USER_GET_CONFIG calls.
> > > +
> > > +* VHOST_USER_SET_CONFIG
> > > +  Id: 25
> > > +  Equivalent ioctl: N/A
> > > +  Master payload: virtio device config space
> > > +
> > > +  Submitted by the vhost-user master when the Guest changes the 
> > > virtio
> > > +  device configuration space and also can be used for live migration
> > > +  on the destination host.
> >
> > There might be security issues if the v

Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-21 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, November 21, 2017 12:27 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v5 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Fri, Nov 17, 2017 at 04:44:37AM +0800, Changpeng Liu wrote:
> > Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can
> be
> > used for live migration of vhost user devices, also vhost user devices
> > can benefit from the messages to get/set virtio config space from/to the
> > I/O target. For the purpose to support virtio config space change,
> > VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> > in case virtio config space change in the I/O target.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  docs/interop/vhost-user.txt   | 39 
> >  hw/virtio/vhost-user.c| 98 
> > +++
> >  hw/virtio/vhost.c | 63 +
> >  include/hw/virtio/vhost-backend.h |  8 
> >  include/hw/virtio/vhost.h | 16 +++
> >  5 files changed, 224 insertions(+)
> >
> > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > index 954771d..1b98388 100644
> > --- a/docs/interop/vhost-user.txt
> > +++ b/docs/interop/vhost-user.txt
> > @@ -116,6 +116,16 @@ Depending on the request type, payload can be:
> >  - 3: IOTLB invalidate
> >  - 4: IOTLB access fail
> >
> > + * Virtio device config space
> > +   ---
> > +   | offset | size | payload |
> > +   ---
> > +
> > +   Offset: a 32-bit offset of virtio device's configuration space
> 
> s/of/in the/
> 
> > +   Size: a 32-bit size of configuration space that master wanted to change
> 
> Is this also used for GET_CONFIG?  If yes, I suggest "a 32-bit
> configuration space access size in bytes".
ok.
> 
> Please mention that Size must be <= 256 bytes.
> 
> > +   Payload: a 256-bytes array holding the contents of the virtio
> > +   device's configuration space
> 
> What about bytes outside the [offset, offset+size) range?  I guess they
> must be 0 and are ignored by the master/slave.
> 
> Would it be cleaner to make Payload a variable-sized field with Size
> bytes?  That way it's not necessary to transfer 0s and memcpy() a subset
> of the payload array.
sounds good, but for vhost-blk driver it can call get_config for the whole 
virtio_blk_config space.
> 
> > +
> >  In QEMU the vhost-user message is implemented with the following struct:
> >
> >  typedef struct VhostUserMsg {
> > @@ -129,6 +139,7 @@ typedef struct VhostUserMsg {
> >  VhostUserMemory memory;
> >  VhostUserLog log;
> >  struct vhost_iotlb_msg iotlb;
> > +VhostUserConfig config;
> >  };
> >  } QEMU_PACKED VhostUserMsg;
> >
> > @@ -596,6 +607,34 @@ Master message types
> >and expect this message once (per VQ) during device configuration
> >(ie. before the master starts the VQ).
> >
> > + * VHOST_USER_GET_CONFIG
> > +  Id: 24
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master to fetch the contents of the 
> > virtio
> > +  device configuration space. The vhost-user master may cache the 
> > contents
> > +  to avoid repeated VHOST_USER_GET_CONFIG calls.
> > +
> > +* VHOST_USER_SET_CONFIG
> > +  Id: 25
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master when the Guest changes the virtio
> > +  device configuration space and also can be used for live migration
> > +  on the destination host.
> 
> There might be security issues if the vhost slave cannot tell whether
> SET_CONFIG is coming from the guest driver or from the master process
> (live migration).  Typically certain fields are read-only for the guest
> driver.  Maybe those fields need to be set by the master after live
> migration.
> 
> One way to solve this is adding a flags field to the message.  A special
> flag can be used for live migration so the slave knows that this
> SET_CONFIG message is allowed to write to read-only fields.
> 
Ok.
> It's also worth documenting that slaves MUST NOT accept SET_CONFIG for
> read-only configuration space fields unless the live migration bit is
> set.  Hopefully this will remind implementors to think through the
> security issues.



Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-21 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Tuesday, November 21, 2017 8:16 AM
> To: Michael S. Tsirkin <m...@redhat.com>; Stefan Hajnoczi <stefa...@gmail.com>
> Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v5 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On 20/11/2017 21:44, Michael S. Tsirkin wrote:
> > Live migrations is supposed to be migrating guest writeable state too.
> > If you mean migrating RO fields like size, then
> > I don't think it's a good idea to reuse SET_CONFIG for that.
> > SET_CONFIG should obey exactly the virtio semantics.
> >
> > And I agree, it should say that slave must treat it as a write,
> > and get config as a read according to virtio semantics.
> >
> > If someone needs to pass configuration from qemu to
> > slave, let's add specific messages with precisely defined semantics.
> 
> Fair enough, but I'd add nevertheless a 32-bit flags field to both
> GET_CONFIG and SET_CONFIG, and document that the slave MUST check that
> it is zero and otherwise fail.
Ok.
> 
> Paolo


Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-24 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, October 24, 2017 8:53 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On Tue, Oct 24, 2017 at 12:44:30AM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> > > Sent: Tuesday, October 24, 2017 1:12 AM
> > > To: Liu, Changpeng <changpeng@intel.com>
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> > > marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> > > <james.r.har...@intel.com>
> > > Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk
> host
> > > device
> > >
> > > On Mon, Oct 23, 2017 at 04:26:36AM +, Liu, Changpeng wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> > > > > Sent: Friday, October 20, 2017 5:55 PM
> > > > > To: Liu, Changpeng <changpeng@intel.com>
> > > > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> > > > > marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> > > > > <james.r.har...@intel.com>
> > > > > Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new 
> > > > > vhost-user-blk
> > > host
> > > > > device
> > > > >
> > > > > On Fri, Oct 20, 2017 at 01:47:58AM +, Liu, Changpeng wrote:
> > > > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > > > +DEFINE_PROP_UINT16("num_queues", VHostUserBlk,
> num_queues,
> > > 1),
> > > > > > > > +DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size,
> 128),
> > > > > > > > +DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, 
> > > > > > > > host_features,
> > > > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> > > > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> > > > > > > > +  VIRTIO_BLK_F_SEG_MAX, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, 
> > > > > > > > host_features,
> > > > > > > > +  VIRTIO_BLK_F_GEOMETRY, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, 
> > > > > > > > host_features,
> > > > > > > > +  VIRTIO_BLK_F_RO, false),
> > > > > > > > +DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, 
> > > > > > > > host_features,
> > > > > > > > +  VIRTIO_BLK_F_BLK_SIZE, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_topology", VHostUserBlk, 
> > > > > > > > host_features,
> > > > > > > > +  VIRTIO_BLK_F_TOPOLOGY, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk,
> host_features,
> > > > > > > > +  VIRTIO_BLK_F_MQ, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> > > > > > > > +  VIRTIO_BLK_F_FLUSH, true),
> > > > > > > > +DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> > > > > > > > +  VIRTIO_BLK_F_BARRIER, false),
> > > > > > > > +DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> > > > > > > > +  VIRTIO_BLK_F_SCSI, fa

Re: [Qemu-devel] [PATCH v4 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-10-23 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, October 24, 2017 1:26 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: Michael S. Tsirkin <m...@redhat.com>; qemu-devel@nongnu.org;
> pbonz...@redhat.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Mon, Oct 23, 2017 at 04:47:00AM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> > > Sent: Friday, October 20, 2017 6:01 PM
> > > To: Michael S. Tsirkin <m...@redhat.com>
> > > Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> > > pbonz...@redhat.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> > > Harris, James R <james.r.har...@intel.com>
> > > Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to
> support
> > > virtio config space
> > >
> > > On Thu, Oct 19, 2017 at 06:36:00PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 19, 2017 at 04:09:35PM +0200, Stefan Hajnoczi wrote:
> > > > > On Thu, Oct 19, 2017 at 01:24:07PM +0800, Changpeng Liu wrote:
> > > > > > @@ -922,6 +931,91 @@ static void 
> > > > > > vhost_user_set_iotlb_callback(struct
> > > vhost_dev *dev, int enabled)
> > > > > >  /* No-op as the receive channel is not dedicated to IOTLB 
> > > > > > messages.
> */
> > > > > >  }
> > > > > >
> > > > > > +static int vhost_user_get_config(struct vhost_dev *dev, uint8_t 
> > > > > > *config,
> > > > > > + size_t config_len)
> > > > > > +{
> > > > > > +VhostUserMsg msg = {
> > > > > > +.request = VHOST_USER_GET_CONFIG,
> > > > > > +.flags = VHOST_USER_VERSION,
> > > > > > +.size = config_len,
> > > > > > +};
> > > > > > +
> > > > > > +if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
> > > > >
> > > > > config_len should be limited to 256 bytes:
> > > > >
> > > > >   if (config_len == 0 || config_len > sizeof(msg.payload.config) {
> > > >
> > > > I would just limit it to a reasonable value, acceptable to
> > > > both master and slave, not fail if it's bigger.
> > > >
> > > >
> > > > > > +error_report("bad config length");
> > > > > > +return -1;
> > > > > > +}
> > > > > > +
> > > > > > +if (vhost_user_write(dev, , NULL, 0) < 0) {
> > > > > > +return -1;
> > > > > > +}
> > > > > > +
> > > > > > +if (vhost_user_read(dev, ) < 0) {
> > > > > > +return -1;
> > > > > > +}
> > > > > > +
> > > > > > +if (msg.request != VHOST_USER_GET_CONFIG) {
> > > > > > +error_report("Received unexpected msg type. Expected %d
> > > received %d",
> > > > > > + VHOST_USER_GET_CONFIG, msg.request);
> > > > > > +return -1;
> > > > > > +}
> > > > > > +
> > > > > > +if (msg.size != config_len) {
> > > > > > +error_report("Received bad msg size.");
> > > > > > +return -1;
> > > > > > +}
> > > > > > +
> > > > > > +memcpy(config, , config_len);
> > > > >
> > > > > There is some complexity here: different virtio devices use different
> > > > > amounts of config space.  Devices may append new fields to the config
> > > > > space to support new features.
> > > > >
> > > > > Therefore I think the simplest protocol is to always fetch the full
> > > > > 256-byte configuration space.  This way the vhost-user slave process 
> > > > > can
> > > > > implement feature bits that the master process does not know about.
> > > > >
> > > > > In other words, I don't think the master pr

Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-23 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, October 24, 2017 1:12 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On Mon, Oct 23, 2017 at 04:26:36AM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> > > Sent: Friday, October 20, 2017 5:55 PM
> > > To: Liu, Changpeng <changpeng@intel.com>
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> > > marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> > > <james.r.har...@intel.com>
> > > Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk
> host
> > > device
> > >
> > > On Fri, Oct 20, 2017 at 01:47:58AM +, Liu, Changpeng wrote:
> > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > +DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues,
> 1),
> > > > > > +DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 
> > > > > > 128),
> > > > > > +DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SEG_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_GEOMETRY, true),
> > > > > > +DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_RO, false),
> > > > > > +DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_BLK_SIZE, true),
> > > > > > +DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_TOPOLOGY, true),
> > > > > > +DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_MQ, true),
> > > > > > +DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_FLUSH, true),
> > > > > > +DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_BARRIER, false),
> > > > > > +DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SCSI, false),
> > > > > > +DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_WCE, false),
> > > > >
> > > > > Please explain how feature negotation works.  The vhost-user slave
> > > > > advertises support features in the return value from
> > > > > VHOST_USER_GET_FEATURES.  How does this additional feature mask
> work
> > > and
> > > > > why is it useful?
> > > > According to Paolo's previous comments, VIRTIO_BLK_F_WCE/
> > > VIRTIO_BLK_F_SCSI/ VIRTIO_BLK_F_BARRIER
> > > > should be removed. Here I added all the feature flags just want to avoid
> the
> > > case that vhost-user slave target
> > > > can support but Qemu vhost block driver cannot support it.
> > >
> > > Please explain a bit more how these options can be used.
> > >
> > > When I looked at the vhost code it seemed like the vhost slave can
> > > report any feature bits it wishes (even things QEMU doesn't know about).
> > > What is the purpose of override some of the feature bits on the QEMU
> > > command-line?
> > Hi Stefan,
> > Here I added a switch which can override vhost-slave's feature bits, for
> example, vhost-slave reported `VIRTIO_BLK_F_RO`,
> > but Qemu vhost-master can disable it through command line when started the
> Qemu. Users don't need to change any
> > vhost-slave's code to disable this feature, and this is also aligned with 
> > vhost-
> scsi and vhost-net's implementation.
> 
> You said vhost-master can disable features but the code doesn't seem to
> work that way:
> 
> +/* Turn on pre-defined features */
> +features |= s->host_features;
User can append parameter when started Qemu: e.g: f_readonly=false to disable 
it.
> 
> If the use case isn't clear please remove these properties for now.
I can make it the same with virtio-blk, hardcoded the mandatory features, and 
put 
VIRTIO_BLK_F_MQ/VIRTIO_BLK_F_RO/VIRTIO_BLK_F_CONFIG_WCE configurable. Thoughts?
> 
> Stefan



Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-23 Thread Liu, Changpeng


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Monday, October 23, 2017 8:55 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: Stefan Hajnoczi <stefa...@gmail.com>; qemu-devel@nongnu.org;
> pbonz...@redhat.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On Mon, Oct 23, 2017 at 04:26:36AM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> > > Sent: Friday, October 20, 2017 5:55 PM
> > > To: Liu, Changpeng <changpeng@intel.com>
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> > > marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> > > <james.r.har...@intel.com>
> > > Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk
> host
> > > device
> > >
> > > On Fri, Oct 20, 2017 at 01:47:58AM +, Liu, Changpeng wrote:
> > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > +DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues,
> 1),
> > > > > > +DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 
> > > > > > 128),
> > > > > > +DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SEG_MAX, true),
> > > > > > +DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_GEOMETRY, true),
> > > > > > +DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_RO, false),
> > > > > > +DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_BLK_SIZE, true),
> > > > > > +DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_TOPOLOGY, true),
> > > > > > +DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_MQ, true),
> > > > > > +DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_FLUSH, true),
> > > > > > +DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_BARRIER, false),
> > > > > > +DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_SCSI, false),
> > > > > > +DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> > > > > > +  VIRTIO_BLK_F_WCE, false),
> > > > >
> > > > > Please explain how feature negotation works.  The vhost-user slave
> > > > > advertises support features in the return value from
> > > > > VHOST_USER_GET_FEATURES.  How does this additional feature mask
> work
> > > and
> > > > > why is it useful?
> > > > According to Paolo's previous comments, VIRTIO_BLK_F_WCE/
> > > VIRTIO_BLK_F_SCSI/ VIRTIO_BLK_F_BARRIER
> > > > should be removed. Here I added all the feature flags just want to avoid
> the
> > > case that vhost-user slave target
> > > > can support but Qemu vhost block driver cannot support it.
> > >
> > > Please explain a bit more how these options can be used.
> > >
> > > When I looked at the vhost code it seemed like the vhost slave can
> > > report any feature bits it wishes (even things QEMU doesn't know about).
> > > What is the purpose of override some of the feature bits on the QEMU
> > > command-line?
> > Hi Stefan,
> > Here I added a switch which can override vhost-slave's feature bits, for
> example, vhost-slave reported `VIRTIO_BLK_F_RO`,
> > but Qemu vhost-master can disable it through command line when started the
> Qemu. Users don't need to change any
> > vhost-slave's code to disable this feature, and this is also aligned with 
> > vhost-
> scsi and vhost-net's implementation.
> 
> Yes but I don't see these properties in virtio_blk_properties. Please
> make the names consistent at least when virtio-blk has them.
> I am pretty sure you don't want to expose e.g. _F_SCSI.
Yes, I should remove F_SCSI/F_WCE/F_BARRIER features, Virtio-blk hardcoded 4 
features: VIRTIO_BLK_F_SEG_MAX/VIRTIO_BLK_F_GEOMETRY/
VIRTIO_BLK_F_TOPOLOGY/VIRTIO_BLK_F_BLK_SIZE, and extra several configuration 
parameters for VIRTIO_BLK_F_CONFIG_WCE/VIRTIO_BLK_F_MQ,
I can change those feature bits same as virtio-blk.
> 
> 
> > > Stefan



Re: [Qemu-devel] [PATCH v4 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-10-22 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Friday, October 20, 2017 6:01 PM
> To: Michael S. Tsirkin <m...@redhat.com>
> Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> pbonz...@redhat.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Thu, Oct 19, 2017 at 06:36:00PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 19, 2017 at 04:09:35PM +0200, Stefan Hajnoczi wrote:
> > > On Thu, Oct 19, 2017 at 01:24:07PM +0800, Changpeng Liu wrote:
> > > > @@ -922,6 +931,91 @@ static void vhost_user_set_iotlb_callback(struct
> vhost_dev *dev, int enabled)
> > > >  /* No-op as the receive channel is not dedicated to IOTLB 
> > > > messages. */
> > > >  }
> > > >
> > > > +static int vhost_user_get_config(struct vhost_dev *dev, uint8_t 
> > > > *config,
> > > > + size_t config_len)
> > > > +{
> > > > +VhostUserMsg msg = {
> > > > +.request = VHOST_USER_GET_CONFIG,
> > > > +.flags = VHOST_USER_VERSION,
> > > > +.size = config_len,
> > > > +};
> > > > +
> > > > +if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
> > >
> > > config_len should be limited to 256 bytes:
> > >
> > >   if (config_len == 0 || config_len > sizeof(msg.payload.config) {
> >
> > I would just limit it to a reasonable value, acceptable to
> > both master and slave, not fail if it's bigger.
> >
> >
> > > > +error_report("bad config length");
> > > > +return -1;
> > > > +}
> > > > +
> > > > +if (vhost_user_write(dev, , NULL, 0) < 0) {
> > > > +return -1;
> > > > +}
> > > > +
> > > > +if (vhost_user_read(dev, ) < 0) {
> > > > +return -1;
> > > > +}
> > > > +
> > > > +if (msg.request != VHOST_USER_GET_CONFIG) {
> > > > +error_report("Received unexpected msg type. Expected %d
> received %d",
> > > > + VHOST_USER_GET_CONFIG, msg.request);
> > > > +return -1;
> > > > +}
> > > > +
> > > > +if (msg.size != config_len) {
> > > > +error_report("Received bad msg size.");
> > > > +return -1;
> > > > +}
> > > > +
> > > > +memcpy(config, , config_len);
> > >
> > > There is some complexity here: different virtio devices use different
> > > amounts of config space.  Devices may append new fields to the config
> > > space to support new features.
> > >
> > > Therefore I think the simplest protocol is to always fetch the full
> > > 256-byte configuration space.  This way the vhost-user slave process can
> > > implement feature bits that the master process does not know about.
> > >
> > > In other words, I don't think the master process knows how much of the
> > > config space is used so it should always request 256 bytes.
> >
> > Each device knows the max config space size.
> >
> > vdev->config_len = config_size;
> 
> I see you're referring to the field that is set in:
> 
>   void virtio_init(VirtIODevice *vdev, const char *name,
>uint16_t device_id, size_t config_size)
> 
> How does this work for vhost-user where different slave programs may
> offer different config sizes?
Each Qemu vhost controller e.g: vhost-user-scsi-pci and vhost-user-blk-pci 
should has different char devices, 
so vhost-slave knows those messages are from vhost-scsi or vhost-blk, of 
course, each UNIX domain socket
should be assigned by users with types: vhsot-scsi or vhost-blk.  
> 
> The QEMU master process does not know the correct size ahead of time.
> The size depends on the vhost-user slave process.  This is why I suggest
> using the full 256 bytes that the VIRTIO spec defines.



Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-22 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Friday, October 20, 2017 5:55 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On Fri, Oct 20, 2017 at 01:47:58AM +, Liu, Changpeng wrote:
> > > > +static Property vhost_user_blk_properties[] = {
> > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > +DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues, 1),
> > > > +DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 128),
> > > > +DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > +DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_SIZE_MAX, true),
> > > > +DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_SEG_MAX, true),
> > > > +DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_GEOMETRY, true),
> > > > +DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_RO, false),
> > > > +DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_BLK_SIZE, true),
> > > > +DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_TOPOLOGY, true),
> > > > +DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_MQ, true),
> > > > +DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_FLUSH, true),
> > > > +DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_BARRIER, false),
> > > > +DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_SCSI, false),
> > > > +DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> > > > +  VIRTIO_BLK_F_WCE, false),
> > >
> > > Please explain how feature negotation works.  The vhost-user slave
> > > advertises support features in the return value from
> > > VHOST_USER_GET_FEATURES.  How does this additional feature mask work
> and
> > > why is it useful?
> > According to Paolo's previous comments, VIRTIO_BLK_F_WCE/
> VIRTIO_BLK_F_SCSI/ VIRTIO_BLK_F_BARRIER
> > should be removed. Here I added all the feature flags just want to avoid the
> case that vhost-user slave target
> > can support but Qemu vhost block driver cannot support it.
> 
> Please explain a bit more how these options can be used.
> 
> When I looked at the vhost code it seemed like the vhost slave can
> report any feature bits it wishes (even things QEMU doesn't know about).
> What is the purpose of override some of the feature bits on the QEMU
> command-line?
Hi Stefan,
Here I added a switch which can override vhost-slave's feature bits, for 
example, vhost-slave reported `VIRTIO_BLK_F_RO`,
but Qemu vhost-master can disable it through command line when started the 
Qemu. Users don't need to change any
vhost-slave's code to disable this feature, and this is also aligned with 
vhost-scsi and vhost-net's implementation.
> 
> Stefan



Re: [Qemu-devel] [PATCH v4 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-10-19 Thread Liu, Changpeng


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Friday, October 20, 2017 10:12 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>; qemu-devel@nongnu.org;
> stefa...@gmail.com; marcandre.lur...@redhat.com; fel...@nutanix.com; Harris,
> James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Fri, Oct 20, 2017 at 01:55:20AM +, Liu, Changpeng wrote:
> >
> >
> > > -Original Message-
> > > From: Michael S. Tsirkin [mailto:m...@redhat.com]
> > > Sent: Friday, October 20, 2017 8:28 AM
> > > To: Paolo Bonzini <pbonz...@redhat.com>
> > > Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> > > stefa...@gmail.com; marcandre.lur...@redhat.com; fel...@nutanix.com;
> Harris,
> > > James R <james.r.har...@intel.com>
> > > Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to 
> > > support
> > > virtio config space
> > >
> > > On Thu, Oct 19, 2017 at 11:04:48PM +0200, Paolo Bonzini wrote:
> > > > On 19/10/2017 19:43, Michael S. Tsirkin wrote:
> > > > > On Thu, Oct 19, 2017 at 05:43:18PM +0200, Paolo Bonzini wrote:
> > > > >> On 19/10/2017 17:39, Michael S. Tsirkin wrote:
> > > > >>>> Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages
> > > which can be
> > > > >>>> used for live migration of vhost user devices, also vhost user 
> > > > >>>> devices
> > > > >>>> can benefit from the messages to get/set virtio config space 
> > > > >>>> from/to the
> > > > >>>> I/O target. For the purpose to support virtio config space change,
> > > > >>>> VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> > > > >>>> in case virtio config space change in the I/O target.
> > > > >>>>
> > > > >>>> Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > > > >>> I don't much like it that config is getting passed through.
> > > > >>>
> > > > >>> IMO this makes managing things harder not easier.
> > > > >>>
> > > > >>> How about specific messages about specific parts of
> > > > >>> config space that you want to get from the backend?
> > > > >>
> > > > >> In the case of virtio-blk that would be all of it.  Do you have a 
> > > > >> case
> > > > >> in mind where some part of the configuration space is owned by QEMU?
> > > > >>
> > > > >> Paolo
> > > > >
> > > > > Yes. seg_max
> > > >
> > > > The seg_max limit is established by whoever reads buffers from the vring
> > > > and passes them down to the lower layer.  For vhost-blk that's the
> > > > device server, not QEMU.
> > > >
> > > > Paolo
> > >
> > > Good point. How about num_queues though?
> > num_queues  is part of virtio_blk config, vhost-user slave can set it,
> > and Qemu driver can rewrite it if user want less IO queues.
> 
> Fundamentally QEMU needs to support this # of queues for this
> device.
> 
> So whenever QEMU doesn't always expose config space as-is,
> need to document the exact semantics.
Agreed, Qemu vhost block driver should always has a default value, so I also 
added the
value as one of the parameters for vhost block driver.
> 
> Also, does backend need to know?
vhost-user slave does know how many queues are used, because vhost-user messages
such as SET_VRING_CALL/KICK are related with queues. Here the idea is 
vhost-user slave
provides the maximum io queues supported, and Qemu users can specify lower io 
queues.
> 
> 
> > >
> > > Also why is there SET_CONFIG? Does not look like blk uses it.
> > Only one possible usage when disable write cache to vhost-user slave device.
> 
> Again need to add documentation what can be written.
Agreed.
> 
> 
> > >
> > > And I wonder how do we do it for other devices.
> > >
> > > E.g. for net there's a bit in the middle of the
> > > config field that deals with migration.
> > Well, I'm okay to make those messages only valid for virtio block device, 
> > because
> it's enough
> > for virtio block to be started with vhost-user slave target.
> 
> OK but I'd rather make them at least somewhat generic so we can reuse
> them down the road.  It looks like adding offset/size pair would solve
> most of the issues. Thoughts?
Do you mean SET_CONFIG message followed with offset/size to let vhost-user slave
Know which field the master want to change?  Yes, sounds good to me.
> 
> > >
> > >
> > > --
> > > MST



Re: [Qemu-devel] [PATCH v4 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-10-19 Thread Liu, Changpeng


> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Friday, October 20, 2017 8:28 AM
> To: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org;
> stefa...@gmail.com; marcandre.lur...@redhat.com; fel...@nutanix.com; Harris,
> James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Thu, Oct 19, 2017 at 11:04:48PM +0200, Paolo Bonzini wrote:
> > On 19/10/2017 19:43, Michael S. Tsirkin wrote:
> > > On Thu, Oct 19, 2017 at 05:43:18PM +0200, Paolo Bonzini wrote:
> > >> On 19/10/2017 17:39, Michael S. Tsirkin wrote:
> > >>>> Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages
> which can be
> > >>>> used for live migration of vhost user devices, also vhost user devices
> > >>>> can benefit from the messages to get/set virtio config space from/to 
> > >>>> the
> > >>>> I/O target. For the purpose to support virtio config space change,
> > >>>> VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> > >>>> in case virtio config space change in the I/O target.
> > >>>>
> > >>>> Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > >>> I don't much like it that config is getting passed through.
> > >>>
> > >>> IMO this makes managing things harder not easier.
> > >>>
> > >>> How about specific messages about specific parts of
> > >>> config space that you want to get from the backend?
> > >>
> > >> In the case of virtio-blk that would be all of it.  Do you have a case
> > >> in mind where some part of the configuration space is owned by QEMU?
> > >>
> > >> Paolo
> > >
> > > Yes. seg_max
> >
> > The seg_max limit is established by whoever reads buffers from the vring
> > and passes them down to the lower layer.  For vhost-blk that's the
> > device server, not QEMU.
> >
> > Paolo
> 
> Good point. How about num_queues though?
num_queues  is part of virtio_blk config, vhost-user slave can set it, 
and Qemu driver can rewrite it if user want less IO queues.
> 
> Also why is there SET_CONFIG? Does not look like blk uses it.
Only one possible usage when disable write cache to vhost-user slave device.
> 
> And I wonder how do we do it for other devices.
> 
> E.g. for net there's a bit in the middle of the
> config field that deals with migration.
Well, I'm okay to make those messages only valid for virtio block device, 
because it's enough
for virtio block to be started with vhost-user slave target.
> 
> 
> --
> MST



Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-19 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Thursday, October 19, 2017 11:18 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On Thu, Oct 19, 2017 at 01:24:08PM +0800, Changpeng Liu wrote:
> > This commit introduces a new vhost-user device for block, it uses a
> > chardev to connect with the backend, same with Qemu virito-blk device,
> > Guest OS still uses the virtio-blk frontend driver.
> >
> > To use it, start Qemu with command line like this:
> >
> > qemu-system-x86_64 \
> > -chardev socket,id=char0,path=/path/vhost.socket \
> > -device vhost-user-blk-pci,chardev=char0,num_queues=1, \
> > bootindex=2... \
> >
> > Users can use different parameters for `num_queues` and `bootindex`.
> >
> > Different with exist Qemu virtio-blk host device, it makes more easy
> > for users to implement their own I/O processing logic, such as all
> > user space I/O stack against hardware block device. It uses the new
> > vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
> > information from backend process.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  configure  |  11 ++
> >  hw/block/Makefile.objs |   3 +
> >  hw/block/vhost-user-blk.c  | 360
> +
> >  hw/virtio/virtio-pci.c |  55 ++
> >  hw/virtio/virtio-pci.h |  18 ++
> >  include/hw/virtio/vhost-user-blk.h |  40 +
> >  6 files changed, 487 insertions(+)
> >  create mode 100644 hw/block/vhost-user-blk.c
> >  create mode 100644 include/hw/virtio/vhost-user-blk.h
> >
> > diff --git a/configure b/configure
> > index 663e908..f2b348f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -318,6 +318,7 @@ tcg="yes"
> >
> >  vhost_net="no"
> >  vhost_scsi="no"
> > +vhost_user_blk="no"
> >  vhost_vsock="no"
> >  vhost_user=""
> >  kvm="no"
> > @@ -782,6 +783,7 @@ Linux)
> >kvm="yes"
> >vhost_net="yes"
> >vhost_scsi="yes"
> > +  vhost_user_blk="yes"
> >vhost_vsock="yes"
> >QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers
> $QEMU_INCLUDES"
> >supported_os="yes"
> > @@ -1139,6 +1141,10 @@ for opt do
> >;;
> >--enable-vhost-scsi) vhost_scsi="yes"
> >;;
> > +  --disable-vhost-user-blk) vhost_user_blk="no"
> > +  ;;
> > +  --enable-vhost-user-blk) vhost_user_blk="yes"
> > +  ;;
> >--disable-vhost-vsock) vhost_vsock="no"
> >;;
> >--enable-vhost-vsock) vhost_vsock="yes"
> > @@ -1511,6 +1517,7 @@ disabled with --disable-FEATURE, default is enabled if
> available:
> >cap-ng  libcap-ng support
> >attrattr and xattr support
> >vhost-net   vhost-net acceleration support
> > +  vhost-user-blk  VM virtio-blk acceleration in user space
> >spice   spice
> >rbd rados block device (rbd)
> >libiscsiiscsi support
> > @@ -5417,6 +5424,7 @@ echo "posix_madvise $posix_madvise"
> >  echo "libcap-ng support $cap_ng"
> >  echo "vhost-net support $vhost_net"
> >  echo "vhost-scsi support $vhost_scsi"
> > +echo "vhost-user-blk support $vhost_user_blk"
> >  echo "vhost-vsock support $vhost_vsock"
> >  echo "vhost-user support $vhost_user"
> >  echo "Trace backends$trace_backends"
> > @@ -5845,6 +5853,9 @@ fi
> >  if test "$vhost_scsi" = "yes" ; then
> >echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
> >  fi
> > +if test "$vhost_user_blk" = "yes" -a "$vhost_user" = "yes"; then
> > +  echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
> > +fi
> >  if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
> >echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
> >  fi
> >

Re: [Qemu-devel] [PATCH v4 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2017-10-19 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, October 19, 2017 7:44 PM
> To: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org
> Cc: stefa...@gmail.com; m...@redhat.com; marcandre.lur...@redhat.com;
> fel...@nutanix.com; Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk
> sample application
> 
> On 19/10/2017 07:24, Changpeng Liu wrote:
> >
> > +}
> > +
> > +static void
> > +vub_flush(VubReq *req)
> > +{
> > +VubDev *vdev_blk = req->vdev_blk;
> > +
> > +if (vdev_blk->blk_fd) {
> > +fsync(vdev_blk->blk_fd);
> > +}
> > +}
> > +
> 
> No need to check the file descriptor---vub_readv and vub_writev aren't
> checking it either.  Also please use fdatasync instead of fsync.
Ok.
> 
> > +static uint64_t
> > +vub_get_features(VuDev *dev)
> > +{
> > +return 1ull << VIRTIO_BLK_F_SIZE_MAX |
> > +   1ull << VIRTIO_BLK_F_SEG_MAX |
> > +   1ull << VIRTIO_BLK_F_TOPOLOGY |
> > +   1ull << VIRTIO_BLK_F_BLK_SIZE |
> > +   1ull << VIRTIO_F_VERSION_1 |
> > +   1ull << VHOST_USER_F_PROTOCOL_FEATURES;
> > +}
> 
> VIRTIO_BLK_F_FLUSH is missing.
Yes, will add.
> 
> Thanks,
> 
> Paolo


Re: [Qemu-devel] [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-10-19 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, October 19, 2017 7:33 PM
> To: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org
> Cc: stefa...@gmail.com; m...@redhat.com; marcandre.lur...@redhat.com;
> fel...@nutanix.com; Harris, James R <james.r.har...@intel.com>
> Subject: Re: [PATCH v4 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> On 19/10/2017 07:24, Changpeng Liu wrote:
> >;;
> >--enable-vhost-scsi) vhost_scsi="yes"
> >;;
> > +  --disable-vhost-user-blk) vhost_user_blk="no"
> > +  ;;
> > +  --enable-vhost-user-blk) vhost_user_blk="yes"
> > +  ;;
> >--disable-vhost-vsock) vhost_vsock="no"
> >;;
> >--enable-vhost-vsock) vhost_vsock="yes"
> > @@ -1511,6 +1517,7 @@ disabled with --disable-FEATURE, default is enabled if
> available:
> >cap-ng  libcap-ng support
> >attrattr and xattr support
> >vhost-net   vhost-net acceleration support
> > +  vhost-user-blk  VM virtio-blk acceleration in user space
> 
> Please use default-configs instead of a new configure switch.  See how
> CONFIG_VHOST_USER_SCSI is used in default-configs/pci.mak and
> default-configs/s390x-softmmu.mak.
Ok, thanks.
> 
> >
> > +static const int user_feature_bits[] = {
> > +VIRTIO_BLK_F_SIZE_MAX,
> > +VIRTIO_BLK_F_SEG_MAX,
> > +VIRTIO_BLK_F_GEOMETRY,
> > +VIRTIO_BLK_F_BLK_SIZE,
> > +VIRTIO_BLK_F_TOPOLOGY,
> > +VIRTIO_BLK_F_SCSI,
> 
> Please omit VIRTIO_BLK_F_SCSI, it's a legacy option that is not anymore
> part of virtio 1.0.
ok
> 
> > +VIRTIO_BLK_F_MQ,
> > +VIRTIO_BLK_F_RO,
> > +VIRTIO_BLK_F_FLUSH,
> > +VIRTIO_BLK_F_BARRIER,
> 
> Same for VIRTIO_BLK_F_BARRIER.
> 
> > +VIRTIO_BLK_F_WCE,
> 
> And VIRTIO_BLK_F_WCE is the same as VIRTIO_BLK_F_FLUSH, so it can be
> removed too.  Please include VIRTIO_BLK_F_CONFIG_WCE instead, since you
> are supporting it in vhost_user_blk_set_config.
Ok.
> 
> > +VIRTIO_F_VERSION_1,
> > +VIRTIO_RING_F_INDIRECT_DESC,
> > +VIRTIO_RING_F_EVENT_IDX,
> > +VIRTIO_F_NOTIFY_ON_EMPTY,
> > +VHOST_INVALID_FEATURE_BIT
> > +};
> 
> >
> > +static const TypeInfo vhost_user_blk_info = {
> > +.name = TYPE_VHOST_USER_BLK,
> > +.parent = TYPE_VIRTIO_DEVICE,
> > +.instance_size = sizeof(VHostUserBlk),
> > +.instance_init = vhost_user_blk_instance_init,
> > +.class_init = vhost_user_blk_class_init,
> > +};
> > +
> 
> There is some code duplication, so maybe it's worth introducing a common
> superclass like TYPE_VIRTIO_SCSI_COMMON.  I'll let others comment on
> whether this is a requirement.
> 
> Paolo


Re: [Qemu-devel] [PATCH 00/27] vhost-user-scsi: code clean-up

2017-09-11 Thread Liu, Changpeng
Hi Marcandre,

Of course, I agree your cleanup patch set should go first, just re-post the 
patch set
after the new Qemu release, I think the first 3 patches don't have dependency on
your  patch set.

From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com] 
Sent: Monday, September 11, 2017 9:18 PM
To: Liu, Changpeng <changpeng@intel.com>; qemu-devel@nongnu.org
Cc: fel...@nutanix.com; Philippe Mathieu-Daudé <f4...@amsat.org>
Subject: Re: [Qemu-devel] [PATCH 00/27] vhost-user-scsi: code clean-up

Hi

On Thu, Aug 24, 2017 at 2:41 AM Liu, Changpeng <changpeng@intel.com> wrote:
Thanks for helping the example code clean-up. I can rebase the vhost-user-blk 
patch
set after your commits. For the issue you mentioned below, I think the 
vhost-user-scsi example
can only support 1 LUN, so LUN0 is exist, LUN1 reported error should be okay.

I think this series should go first before your vhost-user-blk, to avoid code 
churn. Please help review. 

Patches missing review: 1, 2, 3, 5, 8, 9, 10, 11, 13, 17, 20, 22, 24, 25, 26

Thanks


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@redhat.com]
> Sent: Thursday, August 24, 2017 12:20 AM
> To: qemu-devel@nongnu.org
> Cc: Liu, Changpeng <changpeng@intel.com>; fel...@nutanix.com; Marc-André
> Lureau <marcandre.lur...@redhat.com>
> Subject: [PATCH 00/27] vhost-user-scsi: code clean-up
>
> Hi,
>
> While reviewing vhost-user-blk, I realized a lot of code was based on
> vhost-user-scsi, and I found a number of improvements could be
> made. As a result in this series, I tried to move common glib code in
> libvhost-user-glib. (I originally made libvhost-user glib-free, so if
> external projects want to play with it, they don't have to depend on
> glib, for ex vhost-user-bridge doesn't use glib).
>
> I haven't done extensive testing, I tried to setup a LUN with help
> from https://fedoraproject.org/wiki/Scsi-target-utils_Quickstart_Guide, but
> the guest says "Unexpected response from lun 1 while scanning, scan
> aborted" (before or after the series). Help welcome!
>
> Thanks
>
> Marc-André Lureau (27):
>   glib-compat: move G_SOURCE_CONTINUE/REMOVE there
>   libvhost-user: drop dependency on glib
>   libvhost-user: improve vu_queue_pop() doc
>   vhost-user-scsi: use g_strdup()
>   vhost-user-scsi: connect unix socket before allocating
>   vhost-user-scsi: code style fixes
>   vhost-user-scsi: use glib allocation
>   vhost-user-scsi: glib calls that allocate don't return NULL
>   vhost-user-scsi: also free the gtree
>   vhost-user-scsi: remove vdev_scsi_find_by_vu()
>   vhost-user-scsi: simplify unix path cleanup
>   vhost-user-scsi: use NULL pointer
>   vhost-user-scsi: use glib watch directly
>   vhost-user-scsi: assert() in iscsi_add_lun()
>   vhost-user-scsi: remove vdev_scsi_add_iscsi_lun()
>   vhost-user-scsi: remove VUS_MAX_LUNS
>   vhost-user-scsi: remove unimplemented functions
>   vhost-user-scsi: rename VUS types
>   vhost-user-scsi: avoid use of iscsi_ namespace
>   vhost-user-scsi: don't copy iscsi/scsi-lowlevel.h
>   vhost-user-scsi: drop extra callback pointer
>   vhost-user-scsi: simplify source handling
>   vhost-user-scsi: use glib logging
>   libvhost-user: add glib source helper
>   build-sys: fix libvhost-user.a build
>   vhost-user-scsi: use libvhost-user glib helper
>   vhost-user-scsi: remove server_sock from VusDev
>
>  contrib/libvhost-user/libvhost-user-glib.h |  32 ++
>  contrib/libvhost-user/libvhost-user.h      |   3 +-
>  include/glib-compat.h                      |   7 +
>  contrib/libvhost-user/libvhost-user-glib.c | 145 +++
>  contrib/libvhost-user/libvhost-user.c      |  25 +-
>  contrib/vhost-user-scsi/vhost-user-scsi.c  | 619 
>+
>  Makefile                                   |   3 +-
>  Makefile.objs                              |   3 +-
>  contrib/libvhost-user/Makefile.objs        |   2 +-
>  tests/Makefile.include                     |   2 +-
>  10 files changed, 320 insertions(+), 521 deletions(-)
>  create mode 100644 contrib/libvhost-user/libvhost-user-glib.h
>  create mode 100644 contrib/libvhost-user/libvhost-user-glib.c
>
> --
> 2.14.1.146.gd35faa819
-- 
Marc-André Lureau


Re: [Qemu-devel] [PATCH 00/27] vhost-user-scsi: code clean-up

2017-08-23 Thread Liu, Changpeng
Thanks for helping the example code clean-up. I can rebase the vhost-user-blk 
patch
set after your commits. For the issue you mentioned below, I think the 
vhost-user-scsi example
can only support 1 LUN, so LUN0 is exist, LUN1 reported error should be okay.

> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@redhat.com]
> Sent: Thursday, August 24, 2017 12:20 AM
> To: qemu-devel@nongnu.org
> Cc: Liu, Changpeng <changpeng@intel.com>; fel...@nutanix.com; Marc-André
> Lureau <marcandre.lur...@redhat.com>
> Subject: [PATCH 00/27] vhost-user-scsi: code clean-up
> 
> Hi,
> 
> While reviewing vhost-user-blk, I realized a lot of code was based on
> vhost-user-scsi, and I found a number of improvements could be
> made. As a result in this series, I tried to move common glib code in
> libvhost-user-glib. (I originally made libvhost-user glib-free, so if
> external projects want to play with it, they don't have to depend on
> glib, for ex vhost-user-bridge doesn't use glib).
> 
> I haven't done extensive testing, I tried to setup a LUN with help
> from https://fedoraproject.org/wiki/Scsi-target-utils_Quickstart_Guide, but
> the guest says "Unexpected response from lun 1 while scanning, scan
> aborted" (before or after the series). Help welcome!
> 
> Thanks
> 
> Marc-André Lureau (27):
>   glib-compat: move G_SOURCE_CONTINUE/REMOVE there
>   libvhost-user: drop dependency on glib
>   libvhost-user: improve vu_queue_pop() doc
>   vhost-user-scsi: use g_strdup()
>   vhost-user-scsi: connect unix socket before allocating
>   vhost-user-scsi: code style fixes
>   vhost-user-scsi: use glib allocation
>   vhost-user-scsi: glib calls that allocate don't return NULL
>   vhost-user-scsi: also free the gtree
>   vhost-user-scsi: remove vdev_scsi_find_by_vu()
>   vhost-user-scsi: simplify unix path cleanup
>   vhost-user-scsi: use NULL pointer
>   vhost-user-scsi: use glib watch directly
>   vhost-user-scsi: assert() in iscsi_add_lun()
>   vhost-user-scsi: remove vdev_scsi_add_iscsi_lun()
>   vhost-user-scsi: remove VUS_MAX_LUNS
>   vhost-user-scsi: remove unimplemented functions
>   vhost-user-scsi: rename VUS types
>   vhost-user-scsi: avoid use of iscsi_ namespace
>   vhost-user-scsi: don't copy iscsi/scsi-lowlevel.h
>   vhost-user-scsi: drop extra callback pointer
>   vhost-user-scsi: simplify source handling
>   vhost-user-scsi: use glib logging
>   libvhost-user: add glib source helper
>   build-sys: fix libvhost-user.a build
>   vhost-user-scsi: use libvhost-user glib helper
>   vhost-user-scsi: remove server_sock from VusDev
> 
>  contrib/libvhost-user/libvhost-user-glib.h |  32 ++
>  contrib/libvhost-user/libvhost-user.h  |   3 +-
>  include/glib-compat.h  |   7 +
>  contrib/libvhost-user/libvhost-user-glib.c | 145 +++
>  contrib/libvhost-user/libvhost-user.c  |  25 +-
>  contrib/vhost-user-scsi/vhost-user-scsi.c  | 619 
> +
>  Makefile   |   3 +-
>  Makefile.objs  |   3 +-
>  contrib/libvhost-user/Makefile.objs|   2 +-
>  tests/Makefile.include |   2 +-
>  10 files changed, 320 insertions(+), 521 deletions(-)
>  create mode 100644 contrib/libvhost-user/libvhost-user-glib.h
>  create mode 100644 contrib/libvhost-user/libvhost-user-glib.c
> 
> --
> 2.14.1.146.gd35faa819



Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-08-09 Thread Liu, Changpeng


> -Original Message-
> From: Marc-André Lureau [mailto:marcandre.lur...@redhat.com]
> Sent: Wednesday, August 9, 2017 11:39 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; stefa...@gmail.com; pbonz...@redhat.com;
> m...@redhat.com; fel...@nutanix.com; Harris, James R
> <james.r.har...@intel.com>
> Subject: Re: [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk 
> host
> device
> 
> Hi
> 
> - Original Message -
> > This commit introduces a new vhost-user device for block, it uses a
> > chardev to connect with the backend, same with Qemu virito-blk device,
> > Guest OS still uses the virtio-blk frontend driver.
> >
> > To use it, start Qemu with command line like this:
> >
> > qemu-system-x86_64 \
> > -chardev socket,id=char0,path=/path/vhost.socket \
> > -device vhost-user-blk-pci,chardev=char0,num_queues=...
> >
> > Different with exist Qemu virtio-blk host device, it makes more easy
> > for users to implement their own I/O processing logic, such as all
> > user space I/O stack against hardware block device. It uses the new
> > vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
> > information from backend process.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  configure  |  11 ++
> >  hw/block/Makefile.objs |   3 +
> >  hw/block/vhost-user-blk.c  | 360
> >  +
> >  hw/virtio/virtio-pci.c |  55 ++
> >  hw/virtio/virtio-pci.h |  18 ++
> >  include/hw/virtio/vhost-user-blk.h |  40 +
> >  6 files changed, 487 insertions(+)
> >  create mode 100644 hw/block/vhost-user-blk.c
> >  create mode 100644 include/hw/virtio/vhost-user-blk.h
> >
> > diff --git a/configure b/configure
> > index dd73cce..1452c66 100755
> > --- a/configure
> > +++ b/configure
> > @@ -305,6 +305,7 @@ tcg="yes"
> >
> >  vhost_net="no"
> >  vhost_scsi="no"
> > +vhost_user_blk="no"
> >  vhost_vsock="no"
> >  vhost_user=""
> >  kvm="no"
> > @@ -779,6 +780,7 @@ Linux)
> >kvm="yes"
> >vhost_net="yes"
> >vhost_scsi="yes"
> > +  vhost_user_blk="yes"
> >vhost_vsock="yes"
> >QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers
> >$QEMU_INCLUDES"
> >supported_os="yes"
> > @@ -1136,6 +1138,10 @@ for opt do
> >;;
> >--enable-vhost-scsi) vhost_scsi="yes"
> >;;
> > +  --disable-vhost-user-blk) vhost_user_blk="no"
> > +  ;;
> > +  --enable-vhost-user-blk) vhost_user_blk="yes"
> > +  ;;
> 
> I suggest we don't add yet another configure option, but reuse the recently
> introduced --enable-vhost-user (that should cover all vhost-user devices for 
> now,
> but may learn to enable specific devices if needed in the future).
Yes, I noticed there is a new vhost-user configuration, sounds good to me if 
other devices
such as vhost-net and vhost-scsi also use the same configuration option.
> 
> >--disable-vhost-vsock) vhost_vsock="no"
> >;;
> >--enable-vhost-vsock) vhost_vsock="yes"
> > @@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if
> > available:
> >cap-ng  libcap-ng support
> >attrattr and xattr support
> >vhost-net   vhost-net acceleration support
> > +  vhost-user-blk  VM virtio-blk acceleration in user space
> >spice   spice
> >rbd rados block device (rbd)
> >libiscsiiscsi support
> > @@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
> >  echo "libcap-ng support $cap_ng"
> >  echo "vhost-net support $vhost_net"
> >  echo "vhost-scsi support $vhost_scsi"
> > +echo "vhost-user-blk support $vhost_user_blk"
> >  echo "vhost-vsock support $vhost_vsock"
> >  echo "vhost-user support $vhost_user"
> >  echo "Trace backends$trace_backends"
> > @@ -5776,6 +5784,9 @@ fi
> >  if test "$vhost_scsi" = "yes" ; then
> >echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
> >  fi
> > +if test "$vhost_user_blk" = "yes" ; then
> > +  echo "CONFIG_VHOST_USER_BLK=y" >> $

Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-31 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Monday, July 31, 2017 11:41 PM
> To: Stefan Hajnoczi <stefa...@redhat.com>; Liu, Changpeng
> <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; fel...@nutanix.com; m...@redhat.com; Marc-
> André Lureau <marcandre.lur...@redhat.com>
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On 31/07/2017 16:51, Stefan Hajnoczi wrote:
> > typedef enum VhostUserRequest {
> > ...
> >
> > /* Submitted by the vhost-user master when the guest writes to
> >  * virtio config space and also after live migration on the
> >  * destination host.
> >  */
> > VHOST_USER_SET_CONFIG,
> >
> > /* Submitted by the vhost-user master to fetch the contents of the
> >  * virtio config space.  The vhost-user master may cache the
> >  * contents to avoid repeated VHOST_USER_GET_CONFIG calls.
> >  */
> > VHOST_USER_GET_CONFIG,
> >
> > ...
> > };
> 
> Also:
> 
>   /* Submitted by the vhost-user master if it would like to
>* be informed of virtio config space changes.   The slave
>* signals the eventfd whenever config space is modified.
>*/
>   VHOST_USER_SET_CONFIG_FD,
> 
> Paolo
> 
Thanks Stefan and Paolo, looks much cleaner now.
> > struct VuDev {
> > ...
> > int config_change_fd;
> > ...
> > };




Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-28 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Friday, July 28, 2017 6:36 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host 
> device
> 
> On Thu, Jul 27, 2017 at 10:08:49AM +, Liu, Changpeng wrote:
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Thursday, July 27, 2017 5:49 PM
> > > To: Liu, Changpeng <changpeng@intel.com>
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> > > m...@redhat.com
> > > Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk 
> > > host
> > > device
> > >
> > > On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > > > +.fields = (VMStateField[]) {
> > > > > > +VMSTATE_VIRTIO_DEVICE,
> > > > > > +VMSTATE_END_OF_LIST()
> > > > > > +},
> > > > > > +};
> > > > > > +
> > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > > > >
> > > > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes 
> > > > > the
> > > > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > > > drive= parameter.
> > > > >
> > > > > Also, parameters like the discard granularity, optimal I/O size, 
> > > > > logical
> > > > > block size, etc can be initialized to sensible defaults by QEMU's 
> > > > > block
> > > > > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > > > > there is no way for QEMU to set good defaults.
> > > > >
> > > > > Are you relying on the managment tools populating these parameters 
> > > > > with
> > > > > the correct values from the vhost-user-blk process?  Or did you have
> > > > > some other mechanism in mind?
> > > > Actually for this part and your comments about block resize, I think 
> > > > maybe
> add
> > > several
> > > > additional vhost user messages such like:
> > > VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > > > makes the code more clear, I'm okay to add such messages.
> > >
> > > New messages for virtio config space read/write might be useful for
> > > other devices besides virtio-blk.
> > I mean all the block device information like: block_size/capacity are 
> > stored in
> another process,
> > so add a new vhost user message to Qemu vhost-user-blk can get those
> information when Qemu get
> > started, once those messages stored to virtio_pci config space, we will not
> change it.
> 
> No, I think updates are necessary:
> 
> The virtio block device can do online disk resize, so it will be
> necessary to change the capacity field from the vhost-user-blk process
> at runtime and raise a config change notification interrupt.
> 
> The virtio block device also has a config space field ("wce") that is
> writable by the guest.  Supporting this feature also requires virtio
> config space support in vhost-user.
> 
> If you focus on adding generic virtio config space messages to
> vhost-user then these virtio block features can be supported by
> vhost-user-blk.
> 
> Regarding the two approaches of adding "block device information" as you
> have suggested versus adding generic virtio config space support as I'm
> suggesting, from a protocol design perspective it's nicer to have
> generic messages that are usable by all device types.  I'm not aware of
> a reason why high-level "block device information" is needed since QEMU
> will just put that into the config space but otherwise does not
> interpret the information.
Agreed, adding a vhost message  to get/set generic vitio_pci device config 
space 
is clear to me now, it's better than only for vhost-user-blk messages.

Here I still have an concern about *resize* feature for vhost-user-blk, 
currently
Qemu vhost-user-blk is the client for vhost user messages, does this means
the I/O processing process must send a new vhost message back to Qemu
vhost-user-blk driver that the capacity of the block device is changed? Or you
have better idea to do this? Thanks.
> 
> > > It's worth checking how virtio config space is live migrated and how to
> > > do that consistently if the vhost-user process is involved.
> > Virtio will config spaces for virtio_blk, and even the config space 
> > migrated to
> another machine, it should be
> > same with the another I/O process, did I get your comments ? Thanks.
> 
> The guest-writable "wce" config space field is an example that shows the
> vhost-user-blk process on the destination host does not have all the
> state necessary.  QEMU needs to migrate the config space and send it to
> the vhost-user-blk process on the destination.
> 
> Stefan



Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-27 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Thursday, July 27, 2017 5:49 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_VIRTIO_DEVICE,
> > > > +VMSTATE_END_OF_LIST()
> > > > +},
> > > > +};
> > > > +
> > > > +static Property vhost_user_blk_properties[] = {
> > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > >
> > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
> > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > drive= parameter.
> > >
> > > Also, parameters like the discard granularity, optimal I/O size, logical
> > > block size, etc can be initialized to sensible defaults by QEMU's block
> > > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > > there is no way for QEMU to set good defaults.
> > >
> > > Are you relying on the managment tools populating these parameters with
> > > the correct values from the vhost-user-blk process?  Or did you have
> > > some other mechanism in mind?
> > Actually for this part and your comments about block resize, I think maybe 
> > add
> several
> > additional vhost user messages such like:
> VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > makes the code more clear, I'm okay to add such messages.
> 
> New messages for virtio config space read/write might be useful for
> other devices besides virtio-blk.
I mean all the block device information like: block_size/capacity are stored in 
another process, 
so add a new vhost user message to Qemu vhost-user-blk can get those 
information when Qemu get
started, once those messages stored to virtio_pci config space, we will not 
change it.
> 
> It's worth checking how virtio config space is live migrated and how to
> do that consistently if the vhost-user process is involved.
Virtio will config spaces for virtio_blk, and even the config space migrated to 
another machine, it should be
same with the another I/O process, did I get your comments ? Thanks. 
> 
> Stefan



Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-26 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, July 26, 2017 6:35 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On Thu, Jul 27, 2017 at 10:00:50AM +0800, Changpeng Liu wrote:
> > +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t
> *config)
> > +{
> > +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > +struct virtio_blk_config blkcfg;
> > +
> > +memcpy(, config, sizeof(blkcfg));
> > +
> > +if (blkcfg.wce != s->config_wce) {
> > +error_report("vhost-user-blk: does not support the operation");
> 
> If vhost-user-blk doesn't support the operation then please remove the
> VIRTIO_BLK_F_CONFIG_WCE feature bit.  That way the guest knows it cannot
> modify this field.
> 
> > +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> > +{
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > +int ret;
> > +uint64_t size;
> > +
> > +if (!s->chardev.chr) {
> > +error_setg(errp, "vhost-user-blk: chardev is mandatary");
> 
> mandatory
> 
> > +return;
> > +}
> > +
> > +if (!s->num_queues) {
> > +error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> > +return;
> > +}
> > +
> > +if (!s->queue_size) {
> > +error_setg(errp, "vhost-user-blk: invalid count of the IO queue");
> 
> "count of the IO queue" sounds like number of queues.  I suggest saying
> "queue size must be non-zero".
> 
> > +return;
> > +}
> > +
> > +if (!s->size) {
> > +error_setg(errp, "vhost-user-blk: block capacity must be assigned,"
> > +  "size can be specified by GiB or MiB");
> > +return;
> > +}
> > +
> > +ret = qemu_strtosz_MiB(s->size, NULL, );
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-user-blk: invalid size %s in GiB/MiB", 
> > s->size);
> > +return;
> > +}
> > +s->capacity = size / 512;
> > +
> > +/* block size with default 512 Bytes */
> > +if (!s->blkcfg.logical_block_size) {
> > +s->blkcfg.logical_block_size = 512;
> > +}
> > +
> > +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > +sizeof(struct virtio_blk_config));
> > +virtio_add_queue(vdev, s->queue_size, vhost_user_blk_handle_output);
> > +
> > +s->dev.nvqs = s->num_queues;
> > +s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> 
> Please test multi-queue, it's currently broken.  virtio_add_queue() must
> be called for each queue.
Yes, should move virtio_add_queue into loop.
> 
> > +s->dev.vq_index = 0;
> > +s->dev.backend_features = 0;
> > +
> > +ret = vhost_dev_init(>dev, (void *)>chardev,
> 
> The compiler automatically converts any pointer type to void * without a
> warning in C.  (This is different from C++!)
> 
> The (void *) cast can be dropped.
> 
> > + VHOST_BACKEND_TYPE_USER, 0);
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> > +   strerror(-ret));
> 
> If realize fails unrealize() will not be called.  Cleanup must be
> performed here.
> 
> > +return;
> > +}
> > +}
> > +
> > +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> > +{
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VHostUserBlk *s = VHOST_USER_BLK(dev);
> > +
> > +vhost_user_blk_set_status(vdev, 0);
> > +vhost_dev_cleanup(>dev);
> > +g_free(s->dev.vqs);
> > +virtio_cleanup(vdev);
> > +}
> > +
> > +static void vhost_user_blk_instance_init(Object *obj)
> > +{
> > +VHostUserBlk *s = VHOST_USER_BLK(obj);
> > +
> > +device_add_bootindex_property(obj, >bootindex, "bootindex",
> > +  "/disk@0,0", DEVICE(obj), NULL);
> > +}
> > +
> > +static const VMStateDescription vmstate_vhost_user_blk = {
> > +.

Re: [Qemu-devel] [PATCH] virtio-blk: add DISCARD support to virtio-blk driver

2017-03-27 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, March 28, 2017 4:20 AM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: virtio-...@lists.oasis-open.org; 
> virtualizat...@lists.linux-foundation.org; linux-
> ker...@vger.kernel.org; h...@lst.de; qemu-devel@nongnu.org
> Subject: Re: [PATCH] virtio-blk: add DISCARD support to virtio-blk driver
> 
> On Tue, Mar 28, 2017 at 04:39:25PM +0800, Changpeng Liu wrote:
> > Currently virtio-blk driver does not provide discard feature flag, so the
> > filesystems which built on top of the block device will not send discard
> > command. This is okay for HDD backend, but it will impact the performance
> > for SSD backend.
> >
> > Add a feature flag VIRTIO_BLK_F_DISCARD and command
> VIRTIO_BLK_T_DISCARD
> > to extend exist virtio-blk protocol. virtio-blk protocol uses a single
> > 8 bytes descriptor containing type,reserved and sector, currently Linux
> > uses the reserved field as IO priority, here we also re-use the reserved
> > field as number of discard sectors.
> >
> > Signed-off-by: Changpeng Liu <changpeng@intel.com>
> > ---
> >  drivers/block/virtio_blk.c  | 38 +-
> >  include/uapi/linux/virtio_blk.h | 12 ++--
> >  2 files changed, 39 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index 1d4c9f8..550cfe7 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c
> > @@ -241,6 +241,9 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> > case REQ_OP_FLUSH:
> > type = VIRTIO_BLK_T_FLUSH;
> > break;
> > +   case REQ_OP_DISCARD:
> > +   type = VIRTIO_BLK_T_DISCARD;
> > +   break;
> > case REQ_OP_SCSI_IN:
> > case REQ_OP_SCSI_OUT:
> > type = VIRTIO_BLK_T_SCSI_CMD;
> > @@ -256,16 +259,24 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> > vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, type);
> > vbr->out_hdr.sector = type ?
> > 0 : cpu_to_virtio64(vblk->vdev, blk_rq_pos(req));
> > -   vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(req));
> > +   vbr->out_hdr.u.ioprio = cpu_to_virtio32(vblk->vdev, 
> > req_get_ioprio(req));
> >
> > blk_mq_start_request(req);
> >
> > -   num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
> > -   if (num) {
> > -   if (rq_data_dir(req) == WRITE)
> > -   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev,
> VIRTIO_BLK_T_OUT);
> > -   else
> > -   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev,
> VIRTIO_BLK_T_IN);
> > +   if (type == VIRTIO_BLK_T_DISCARD) {
> > +   vbr->out_hdr.u.discard_nr_sectors = cpu_to_virtio32(vblk->vdev,
> > +
> blk_rq_sectors(req));
> > +   num = 0;
> > +   } else {
> > +   num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
> > +   if (num) {
> > +   if (rq_data_dir(req) == WRITE)
> > +   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev,
> > +
> VIRTIO_BLK_T_OUT);
> > +   else
> > +   vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev,
> > +
> VIRTIO_BLK_T_IN);
> > +   }
> > }
> >
> > spin_lock_irqsave(>vqs[qid].lock, flags);
> > @@ -775,6 +786,15 @@ static int virtblk_probe(struct virtio_device *vdev)
> > if (!err && opt_io_size)
> > blk_queue_io_opt(q, blk_size * opt_io_size);
> >
> > +   if (virtio_has_feature(vdev, VIRTIO_BLK_F_DISCARD)) {
> > +   q->limits.discard_zeroes_data = 0;
> > +   q->limits.discard_alignment = blk_size;
> > +   q->limits.discard_granularity = blk_size;
> > +   blk_queue_max_discard_sectors(q, UINT_MAX);
> > +   blk_queue_max_discard_segments(q, 1);
> > +   queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
> > +   }
> 
> Please add configuration space fields for these limits.  Looking at the
> virtio-scsi block limits code in QEMU's scsi_disk_emulate_inquiry() I
> can see that the hypervisor has useful values that it wants to
> communicate.  They shouldn't be hardcoded to blk_size.
Yes, move discard related parameters to configuration space make sense.
> 
> > +
> > virtio_device_ready(vdev);
> >
> > device_add_dis

Re: [Qemu-devel] [virtio-dev] [PATCH] virtio-blk: add DISCARD support to virtio-blk driver

2017-03-27 Thread Liu, Changpeng


> -Original Message-
> From: virtio-...@lists.oasis-open.org 
> [mailto:virtio-...@lists.oasis-open.org] On
> Behalf Of Paolo Bonzini
> Sent: Monday, March 27, 2017 7:34 PM
> To: Liu, Changpeng <changpeng@intel.com>; virtio-...@lists.oasis-open.org;
> virtualizat...@lists.linux-foundation.org; linux-ker...@vger.kernel.org; 
> h...@lst.de
> Cc: qemu-devel@nongnu.org
> Subject: Re: [virtio-dev] [PATCH] virtio-blk: add DISCARD support to 
> virtio-blk driver
> 
> 
> 
> On 28/03/2017 10:39, Changpeng Liu wrote:
> > +   if (virtio_has_feature(vdev, VIRTIO_BLK_F_DISCARD)) {
> > +   q->limits.discard_zeroes_data = 0;
> 
> Maybe you could use another feature bit to populate discard_zeroes_data.
> 
> Paolo
> 
Sounds good to me, Christoph Hellwig mentioned this field will be removed in 
next release, just removed this line makes clear.
> > +   q->limits.discard_alignment = blk_size;
> > +   q->limits.discard_granularity = blk_size;
> > +   blk_queue_max_discard_sectors(q, UINT_MAX);
> > +   blk_queue_max_discard_segments(q, 1);
> > +   queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
> > +   }
> > +
> 
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [Qemu-devel] [PATCH] virtio-blk: add DISCARD support to virtio-blk driver

2017-03-27 Thread Liu, Changpeng


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Monday, March 27, 2017 10:56 PM
> To: Liu, Changpeng <changpeng@intel.com>
> Cc: virtio-...@lists.oasis-open.org; 
> virtualizat...@lists.linux-foundation.org; linux-
> ker...@vger.kernel.org; h...@lst.de; qemu-devel@nongnu.org
> Subject: Re: [PATCH] virtio-blk: add DISCARD support to virtio-blk driver
> 
> On Tue, Mar 28, 2017 at 04:39:25PM +0800, Changpeng Liu wrote:
> > Currently virtio-blk driver does not provide discard feature flag, so the
> > filesystems which built on top of the block device will not send discard
> > command. This is okay for HDD backend, but it will impact the performance
> > for SSD backend.
> >
> > Add a feature flag VIRTIO_BLK_F_DISCARD and command
> VIRTIO_BLK_T_DISCARD
> > to extend exist virtio-blk protocol. virtio-blk protocol uses a single
> > 8 bytes descriptor containing type,reserved and sector, currently Linux
> > uses the reserved field as IO priority, here we also re-use the reserved
> > field as number of discard sectors.
> 
> Do you have a link to the specification for this feature?  At least
> virtio-v1.0 does not seem to specify a discard feature.
Not yet, patch goes first, there are uses who don't want to migrate from 
virtio-blk to virtio-scsi,
so this feature seems reasonable, the question is that how we should define the 
specification to
support this feature.
> 
> Note that Linux 4.11 and later have support for multi-range discard
> ala ATA TRIM, SCSI UNMAP and NVMe deallocate which might be useful
> here, too.
For support multi-range discard features, the DISCARD command must have data_in 
buffer, similar with
SCSI UNMAP and NVMe DEALLOCATE commands, maybe 16 bytes aligned descriptors are 
required for each
DISCARD command.
> 
> > +   q->limits.discard_zeroes_data = 0;
> 
> No need to clear this.  Also hopefully this field goes away for 4.12
> 
> > +   blk_queue_max_discard_segments(q, 1);
> 
> No need to set this.