from:"Paolo Bonzini"

Re: [PATCH V2 6/8] scsi: virtio_scsi: fix IO hang by irq vector automatic affinity

2018-02-05 Thread Paolo Bonzini

On 05/02/2018 16:20, Ming Lei wrote:
> Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs)
> has been merged to V4.16-rc, and it is easy to allocate all offline CPUs
> for some irq vectors, this can't be avoided even though the allocation
> is improved.
> 
> For example, on a 8cores VM, 4~7 are not-present/offline, 4 queues of
> virtio-scsi, the irq affinity assigned can become the following shape:
> 
>   irq 36, cpu list 0-7
>   irq 37, cpu list 0-7
>   irq 38, cpu list 0-7
>   irq 39, cpu list 0-1
>   irq 40, cpu list 4,6
>   irq 41, cpu list 2-3
>   irq 42, cpu list 5,7
> 
> Then IO hang is triggered in case of non-SCSI_MQ.
> 
> Given storage IO is always C/S model, there isn't such issue with 
> SCSI_MQ(blk-mq),
> because no IO can be submitted to one hw queue if the hw queue hasn't online
> CPUs.
> 
> Fix this issue by forcing to use blk_mq.
> 
> BTW, I have been used virtio-scsi(scsi_mq) for several years, and it has
> been quite stable, so it shouldn't cause extra risk.

I think that's ok now that we have I/O schedulers for blk-mq.

Acked-by: Paolo Bonzini <pbonz...@redhat.com>

Paolo

> Cc: Arun Easi <arun.e...@cavium.com>
> Cc: Omar Sandoval <osan...@fb.com>,
> Cc: "Martin K. Petersen" <martin.peter...@oracle.com>,
> Cc: James Bottomley <james.bottom...@hansenpartnership.com>,
> Cc: Christoph Hellwig <h...@lst.de>,
> Cc: Don Brace <don.br...@microsemi.com>
> Cc: Kashyap Desai <kashyap.de...@broadcom.com>
> Cc: Peter Rivera <peter.riv...@broadcom.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Mike Snitzer <snit...@redhat.com>
> Reviewed-by: Hannes Reinecke <h...@suse.de>
> Tested-by: Laurence Oberman <lober...@redhat.com>
> Signed-off-by: Ming Lei <ming@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 59 
> +++---
>  1 file changed, 3 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 7c28e8d4955a..54e3a0f6844c 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -91,9 +91,6 @@ struct virtio_scsi_vq {
>  struct virtio_scsi_target_state {
>   seqcount_t tgt_seq;
>  
> - /* Count of outstanding requests. */
> - atomic_t reqs;
> -
>   /* Currently active virtqueue for requests sent to this target. */
>   struct virtio_scsi_vq *req_vq;
>  };
> @@ -152,8 +149,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi 
> *vscsi, void *buf)
>   struct virtio_scsi_cmd *cmd = buf;
>   struct scsi_cmnd *sc = cmd->sc;
>   struct virtio_scsi_cmd_resp *resp = >resp.cmd;
> - struct virtio_scsi_target_state *tgt =
> - scsi_target(sc->device)->hostdata;
>  
>   dev_dbg(>device->sdev_gendev,
>   "cmd %p response %u status %#02x sense_len %u\n",
> @@ -210,8 +205,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi 
> *vscsi, void *buf)
>   }
>  
>   sc->scsi_done(sc);
> -
> - atomic_dec(>reqs);
>  }
>  
>  static void virtscsi_vq_done(struct virtio_scsi *vscsi,
> @@ -580,10 +573,7 @@ static int virtscsi_queuecommand_single(struct Scsi_Host 
> *sh,
>   struct scsi_cmnd *sc)
>  {
>   struct virtio_scsi *vscsi = shost_priv(sh);
> - struct virtio_scsi_target_state *tgt =
> - scsi_target(sc->device)->hostdata;
>  
> - atomic_inc(>reqs);
>   return virtscsi_queuecommand(vscsi, >req_vqs[0], sc);
>  }
>  
> @@ -596,55 +586,11 @@ static struct virtio_scsi_vq 
> *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
>   return >req_vqs[hwq];
>  }
>  
> -static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
> -struct virtio_scsi_target_state 
> *tgt)
> -{
> - struct virtio_scsi_vq *vq;
> - unsigned long flags;
> - u32 queue_num;
> -
> - local_irq_save(flags);
> - if (atomic_inc_return(>reqs) > 1) {
> - unsigned long seq;
> -
> - do {
> - seq = read_seqcount_begin(>tgt_seq);
> - vq = tgt->req_vq;
> - } while (read_seqcount_retry(>tgt_seq, seq));
> - } else {
> - /* no writes can be concurrent because of atomic_t */
> - write_seqcount_begin(>tgt_seq);
> -
> - /* keep previous req_vq if a reader just arrived */
> - if (unlikely(atomic_read(>reqs) > 1))

Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.

2017-08-11 Thread Paolo Bonzini

On 11/08/2017 19:23, Michael S. Tsirkin wrote:
> On Fri, Aug 11, 2017 at 04:09:26PM +0200, Paolo Bonzini wrote:
>> On 10/08/2017 23:41, Michael S. Tsirkin wrote:
>>>>> Then we probably should fail probe if vq size is too small.
>>>> What does this mean?
>>>
>>> We must prevent driver from submitting s/g lists > vq size to device.
>>
>> What is the rationale for the limit?
> 
> So the host knows what it needs to support.
> 
>> both virtio-blk and virtio-scsi transmit their own value for the
>> maximum sg list size (max_seg in virtio-scsi, seg_max in virtio-blk).
> 
> No other device has it, and it seemed like a good idea to
> limit it generally at the time.
> 
> we can fix the spec to relax the requirement for blk and scsi -
> want to submit a proposal? Alternatively, add a generic field
> for that.

Yes, I can submit a proposal.  blk and scsi are the ones that are most
likely to have very long sg lists.

When I was designing scsi I just copied that field from blk. :)

Paolo

Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.

2017-08-11 Thread Paolo Bonzini

On 10/08/2017 23:41, Michael S. Tsirkin wrote:
>>> Then we probably should fail probe if vq size is too small.
>> What does this mean?
> 
> We must prevent driver from submitting s/g lists > vq size to device.

What is the rationale for the limit?  It makes no sense if indirect
descriptors are available, especially because...

> Either tell linux to avoid s/g lists that are too long, or
> simply fail request if this happens, or refuse to attach driver to device.
> 
> Later option would look something like this within probe:
> 
> for (i = VIRTIO_SCSI_VQ_BASE; i < num_vqs; i++)
>   if (vqs[i]->num < MAX_SG_USED_BY_LINUX)
>   goto err;
> 
> 
> I don't know what's MAX_SG_USED_BY_LINUX though.
> 

... both virtio-blk and virtio-scsi transmit their own value for the
maximum sg list size (max_seg in virtio-scsi, seg_max in virtio-blk).

Paolo

Re: [PATCH v2 2/2] virtio: virtio_scsi: Set can_queue to the length of the virtqueue.

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 18:56, Richard W.M. Jones wrote:
> Since switching to blk-mq as the default in commit 5c279bd9e406
> ("scsi: default to scsi-mq"), virtio-scsi LUNs consume about 10x as
> much kernel memory.
> 
> qemu currently allocates a fixed 128 entry virtqueue.  can_queue
> currently is set to 1024.  But with indirect descriptors, each command
> in the queue takes 1 virtqueue entry, so the number of commands which
> can be queued is equal to the length of the virtqueue.
> 
> Note I intend to send a patch to qemu to allow the virtqueue size to
> be configured from the qemu command line.
> 
> Thanks Paolo Bonzini, Christoph Hellwig.
> 
> Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..7c28e8d4955a 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -818,7 +818,6 @@ static struct scsi_host_template 
> virtscsi_host_template_single = {
>   .eh_timed_out = virtscsi_eh_timed_out,
>   .slave_alloc = virtscsi_device_alloc,
>  
> - .can_queue = 1024,
>   .dma_boundary = UINT_MAX,
>   .use_clustering = ENABLE_CLUSTERING,
>   .target_alloc = virtscsi_target_alloc,
> @@ -839,7 +838,6 @@ static struct scsi_host_template 
> virtscsi_host_template_multi = {
>   .eh_timed_out = virtscsi_eh_timed_out,
>   .slave_alloc = virtscsi_device_alloc,
>  
> - .can_queue = 1024,
>   .dma_boundary = UINT_MAX,
>   .use_clustering = ENABLE_CLUSTERING,
>   .target_alloc = virtscsi_target_alloc,
> @@ -972,6 +970,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
>   if (err)
>   goto virtscsi_init_failed;
>  
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
>   cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
>   shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
>   shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x;
> 

Acked-by: Paolo Bonzini <pbonz...@redhat.com>

Re: [PATCH 2/2] virtio: virtio_scsi: Set can_queue to the length of the virtqueue.

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 18:40, Richard W.M. Jones wrote:
> Since switching to blk-mq as the default in commit 5c279bd9e406
> ("scsi: default to scsi-mq"), virtio-scsi LUNs consume about 10x as
> much kernel memory.
> 
> qemu currently allocates a fixed 128 entry virtqueue.  can_queue
> currently is set to 1024.  But with indirect descriptors, each command
> in the queue takes 1 virtqueue entry, so the number of commands which
> can be queued is equal to the length of the virtqueue.
> 
> Note I intend to send a patch to qemu to allow the virtqueue size to
> be configured from the qemu command line.

You can clear .can_queue from the templates.  Otherwise looks good.

Paolo

> Thanks Paolo Bonzini, Christoph Hellwig.
> 
> Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..d6b4ff634c0d 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
>   if (err)
>   goto virtscsi_init_failed;
>  
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
>   cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
>   shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
>   shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x;
>

Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 18:40, Richard W.M. Jones wrote:
> If using indirect descriptors, you can make the total_sg as large as
> you want.  If not, BUG is too serious because the function later
> returns -ENOSPC.
> 
> Thanks Paolo Bonzini, Christoph Hellwig.
> 
> Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5e1b548828e6..27cbc1eab868 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>   }
>  #endif
>  
> - BUG_ON(total_sg > vq->vring.num);
>   BUG_ON(total_sg == 0);
>  
>   head = vq->free_head;
> @@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>* buffers, then go indirect. FIXME: tune this threshold */
>   if (vq->indirect && total_sg > 1 && vq->vq.num_free)
>   desc = alloc_indirect(_vq, total_sg, gfp);
> - else
> + else {
>   desc = NULL;
> + WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);

So we get here only if vq->vq.num_free is zero.  In that case,
vq->indirect makes no difference for the appropriateness of the warning
(that is, it should never be issued for indirect descriptors).

> + }
>  
>   if (desc) {
>   /* Use a single buffer which doesn't continue */
> 


Reviewed-by: Paolo Bonzini <pbonz...@redhat.com>

Paolo

Re: Increased memory usage with scsi-mq

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 17:40, Richard W.M. Jones wrote:
> OK this is looking a bit better now.
> 
>   With scsi-mq enabled:   175 disks
>   virtqueue_size=64:  318 disks *
>   virtqueue_size=16:  775 disks *
>   With scsi-mq disabled: 1755 disks
> * = new results
> 
> I also ran the whole libguestfs test suite with virtqueue_size=16
> (with no failures shown).  As this tests many different disk I/O
> operations, it gives me some confidence that things generally work.

Yes, it looks good.  I'll grab you on IRC to discuss the commit message.

Paolo

> Do you have any other comments about the patches?  I'm not sure I know
> enough to write an intelligent commit message for the kernel patch.
> 
> Rich.

> 
> --- kernel patch ---
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..d6b4ff634c0d 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
>   if (err)
>   goto virtscsi_init_failed;
>  
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
>   cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
>   shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
>   shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x;
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5e1b548828e6..2d7509da9f39 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>   }
>  #endif
>  
> - BUG_ON(total_sg > vq->vring.num);
>   BUG_ON(total_sg == 0);
>  
>   head = vq->free_head;
> @@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>* buffers, then go indirect. FIXME: tune this threshold */
>   if (vq->indirect && total_sg > 1 && vq->vq.num_free)
>   desc = alloc_indirect(_vq, total_sg, gfp);
> - else
> + else {
>   desc = NULL;
> + WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);
> +}
>  
>   if (desc) {
>   /* Use a single buffer which doesn't continue */
> 
> --- qemu patch ---
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index eb639442d1..aadd99aad1 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
>  s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
>  s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
>  
> -s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
> -s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
> +s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
> +s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
>  for (i = 0; i < s->conf.num_queues; i++) {
> -s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
> +s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
>  }
>  }
>  
> @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState 
> *dev, Error **errp)
>  
>  static Property virtio_scsi_properties[] = {
>  DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 
> 1),
> +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, 
> parent_obj.conf.virtqueue_size, 128),
>  DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, 
> parent_obj.conf.max_sectors,
>0x),
>  DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, 
> parent_obj.conf.cmd_per_lun,
> diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
> index de6ae5a9f6..e30a92d3e7 100644
> --- a/include/hw/virtio/virtio-scsi.h
> +++ b/include/hw/virtio/virtio-scsi.h
> @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;
>  
>  struct VirtIOSCSIConf {
>  uint32_t num_queues;
> +uint32_t virtqueue_size;
>  uint32_t max_sectors;
>  uint32_t cmd_per_lun;
>  #ifdef CONFIG_VHOST_SCSI
> 
> 
>

Re: Increased memory usage with scsi-mq

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 16:16, Richard W.M. Jones wrote:
> On Thu, Aug 10, 2017 at 02:53:58PM +0200, Paolo Bonzini wrote:
>> can_queue and cmd_per_lun are different.  can_queue should be set to the
>> value of vq->vring.num where vq is the command virtqueue (the first one
>> is okay if there's >1).
>>
>> If you want to change it, you'll have to do so in QEMU.
> 
> Here are a couple more patches I came up with, the first to Linux,
> the second to qemu.
> 
> There are a few problems ...
> 
> (1) Setting virtqueue_size to < 128 causes a BUG_ON failure in
> virtio_ring.c:virtqueue_add at:
> 
> BUG_ON(total_sg > vq->vring.num);

That bug is bogus.  You can change it to

WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);

and move it in the "else" here:

if (vq->indirect && total_sg > 1 && vq->vq.num_free)
desc = alloc_indirect(_vq, total_sg, gfp);
else
desc = NULL;

You just get a -ENOSPC after the WARN, so no need to crash the kernel!

> (2) Can/should the ctrl, event and cmd queue sizes be set to the same
> values?  Or should ctrl & event be left at 128?

It's okay if they're all the same.

> (3) It seems like it might be a problem that virtqueue_size is not
> passed through the virtio_scsi_conf struct (which is ABI between the
> kernel and qemu AFAICT and so not easily modifiable).  However I think
> this is not a problem because virtqueue size is stored in the
> Virtqueue Descriptor table, which is how the kernel gets the value.
> Is that right?

Yes.

Paolo

> Rich.
> 
> 
> --- kernel patch ---
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..d6b4ff634c0d 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
>   if (err)
>   goto virtscsi_init_failed;
>  
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
>   cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
>   shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
>   shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x;
> 
> 
> --- qemu patch ---
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index eb639442d1..aadd99aad1 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
>  s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
>  s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
>  
> -s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
> -s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
> +s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
> +s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
>  for (i = 0; i < s->conf.num_queues; i++) {
> -s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
> +s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
>  }
>  }
>  
> @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState 
> *dev, Error **errp)
>  
>  static Property virtio_scsi_properties[] = {
>  DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 
> 1),
> +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, 
> parent_obj.conf.virtqueue_size, 128),
>  DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, 
> parent_obj.conf.max_sectors,
>0x),
>  DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, 
> parent_obj.conf.cmd_per_lun,
> diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
> index de6ae5a9f6..e30a92d3e7 100644
> --- a/include/hw/virtio/virtio-scsi.h
> +++ b/include/hw/virtio/virtio-scsi.h
> @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;
>  
>  struct VirtIOSCSIConf {
>  uint32_t num_queues;
> +uint32_t virtqueue_size;
>  uint32_t max_sectors;
>  uint32_t cmd_per_lun;
>  #ifdef CONFIG_VHOST_SCSI
> 
>

Re: Increased memory usage with scsi-mq

2017-08-10 Thread Paolo Bonzini

On 10/08/2017 14:22, Richard W.M. Jones wrote:
> On Wed, Aug 09, 2017 at 06:50:10PM +0200, Paolo Bonzini wrote:
>> On 09/08/2017 18:01, Christoph Hellwig wrote:
>>> On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
>>>> can_queue should depend on the virtqueue size, which unfortunately can
>>>> vary for each virtio-scsi device in theory.  The virtqueue size is
>>>> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
>>>> in QEMU it is the second argument to virtio_add_queue.
>>>
>>> Why is that unfortunate?  We don't even have to set can_queue in
>>> the host template, we can dynamically set it on per-host basis.
>>
>> Ah, cool, I thought allocations based on can_queue happened already in
>> scsi_host_alloc, but they happen at scsi_add_host time.
> 
> I think I've decoded all that information into the patch below.
> 
> I tested it, and it appears to work: when I set cmd_per_lun on the
> qemu command line, I see that the guest can add more disks:
> 
>   With scsi-mq enabled:   175 disks
>   cmd_per_lun not set:177 disks  *
>   cmd_per_lun=16: 776 disks  *
>   cmd_per_lun=4: 1160 disks  *
>   With scsi-mq disabled: 1755 disks
>  * = new result
> 
> From my point of view, this is a good result, but you should be warned
> that I don't fully understand what's going on here and I may have made
> obvious or not-so-obvious mistakes.

can_queue and cmd_per_lun are different.  can_queue should be set to the
value of vq->vring.num where vq is the command virtqueue (the first one
is okay if there's >1).

If you want to change it, you'll have to do so in QEMU.

Paolo

> I tested the performance impact and it's not noticable in the
> libguestfs case even with very small cmd_per_lun settings, but
> libguestfs is largely serial and so this result won't be applicable to
> guests in general.
> 
> Also, should the guest kernel validate cmd_per_lun to make sure it's
> not too small or large?  And if so, what would the limits be?
> 
> Rich.
> 
> From e923e49846189b2f55f3f02b70a290d4be237ed5 Mon Sep 17 00:00:00 2001
> From: "Richard W.M. Jones" <rjo...@redhat.com>
> Date: Thu, 10 Aug 2017 12:21:47 +0100
> Subject: [PATCH] scsi: virtio_scsi: Set can_queue based on cmd_per_lun passed
>  by hypervisor.
> 
> Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..b22591e9b16b 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -973,7 +973,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
>   goto virtscsi_init_failed;
>  
>   cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> - shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
> + shost->cmd_per_lun = shost->can_queue = cmd_per_lun;
>   shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x;
>  
>   /* LUNs > 256 are reported with format 1, so they go in the range
>

Re: Increased memory usage with scsi-mq

2017-08-09 Thread Paolo Bonzini

On 09/08/2017 18:01, Christoph Hellwig wrote:
> On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
>> can_queue should depend on the virtqueue size, which unfortunately can
>> vary for each virtio-scsi device in theory.  The virtqueue size is
>> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
>> in QEMU it is the second argument to virtio_add_queue.
> 
> Why is that unfortunate?  We don't even have to set can_queue in
> the host template, we can dynamically set it on per-host basis.

Ah, cool, I thought allocations based on can_queue happened already in
scsi_host_alloc, but they happen at scsi_add_host time.

Paolo

Re: Increased memory usage with scsi-mq

2017-08-07 Thread Paolo Bonzini

On 07/08/2017 14:27, Richard W.M. Jones wrote:
> Also I noticed this code in virtio_scsi.c:
> 
> cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
> 
> but setting cmd_per_lun (as a qemu virtio-scsi-pci parameter) didn't
> seem to make any difference to memory usage.

That's because memory depends on shost->can_queue, not
shost->cmd_per_lun (see scsi_mq_setup_tags).

can_queue should depend on the virtqueue size, which unfortunately can
vary for each virtio-scsi device in theory.  The virtqueue size is
retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
in QEMU it is the second argument to virtio_add_queue.

For virtio-scsi this is VIRTIO_SCSI_VQ_SIZE, which is 128, so changing
can_queue to 128 should be a no brainer.

Thanks,

Paolo

Re: Increased memory usage with scsi-mq

2017-08-07 Thread Paolo Bonzini

On 05/08/2017 17:51, Richard W.M. Jones wrote:
> On Sat, Aug 05, 2017 at 03:39:54PM +0200, Christoph Hellwig wrote:
>> For now can you apply this testing patch to the guest kernel?
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 9be211d68b15..0cbe2c882e1c 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -818,7 +818,7 @@ static struct scsi_host_template 
>> virtscsi_host_template_single = {
>>  .eh_timed_out = virtscsi_eh_timed_out,
>>  .slave_alloc = virtscsi_device_alloc,
>>  
>> -.can_queue = 1024,
>> +.can_queue = 64,
>>  .dma_boundary = UINT_MAX,
>>  .use_clustering = ENABLE_CLUSTERING,
>>  .target_alloc = virtscsi_target_alloc,
>> @@ -839,7 +839,7 @@ static struct scsi_host_template 
>> virtscsi_host_template_multi = {
>>  .eh_timed_out = virtscsi_eh_timed_out,
>>  .slave_alloc = virtscsi_device_alloc,
>>  
>> -.can_queue = 1024,
>> +.can_queue = 64,
>>  .dma_boundary = UINT_MAX,
>>  .use_clustering = ENABLE_CLUSTERING,
>>  .target_alloc = virtscsi_target_alloc,
>> @@ -983,7 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
>>  shost->max_id = num_targets;
>>  shost->max_channel = 0;
>>  shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
>> -shost->nr_hw_queues = num_queues;
>> +shost->nr_hw_queues = 1;
>>  
>>  #ifdef CONFIG_BLK_DEV_INTEGRITY
>>  if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {
> 
> Yes, that's an improvement, although it's still a little way off the
> density possible the old way:
> 
>   With scsi-mq enabled:   175 disks
> * With this patch:319 disks *
>   With scsi-mq disabled: 1755 disks
> 
> Also only the first two hunks are necessary.  The kernel behaves
> exactly the same way with or without the third hunk (ie. num_queues
> must already be 1).
> 
> Can I infer from this that qemu needs a way to specify the can_queue
> setting to the virtio-scsi driver in the guest kernel?

You could also add a module parameter to the driver, and set it to 64 on
the kernel command line (there is an example in
drivers/scsi/vmw_pvscsi.c of how to do it).

Paolo

[PATCH] virtio_scsi: always read VPD pages for multiqueue too

2017-07-05 Thread Paolo Bonzini

Multi-queue virtio-scsi uses a different scsi_host_template struct.
Add the .device_alloc field there, too.

Fixes: 25d1d50e23275e141e3a3fe06c25a99f4c4bf4e0
Cc: sta...@vger.kernel.org
Cc: David Gibson <da...@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index f8dbfeee6c63..ad1e7f1aba4c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -826,6 +826,7 @@ static int virtscsi_map_queues(struct Scsi_Host *shost)
.change_queue_depth = virtscsi_change_queue_depth,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
+   .slave_alloc = virtscsi_device_alloc,
 
.can_queue = 1024,
.dma_boundary = UINT_MAX,
-- 
1.8.3.1

[PATCH] virtio_scsi: let host do exception handling

2017-06-21 Thread Paolo Bonzini

virtio_scsi tries to do exception handling after the default
30 seconds timeout expires.  However, it's better to let the host
control the timeout, otherwise with a heavy I/O load it is
likely that an abort will also timeout.  This leads to fatal
errors like filesystems going offline.

Disable the 'sd' timeout and allow the host to do exception
handling, following the precedent of the storvsc driver.

Hannes has a proposal to introduce timeouts in virtio, but
this provides an immediate solution for stable kernels too.

Reported-by: Douglas Miller <dougm...@linux.vnet.ibm.com>
Cc: "James E.J. Bottomley" <j...@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>
Cc: Hannes Reinecke <h...@suse.de>
Cc: linux-scsi@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index f8dbfeee6c63..55d344ea6869 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -796,6 +796,16 @@ static int virtscsi_map_queues(struct Scsi_Host *shost)
return blk_mq_virtio_map_queues(>tag_set, vscsi->vdev, 2);
 }
 
+/*
+ * The host guarantees to respond to each command, although I/O latencies might
+ * be higher than on bare meta.  Reset the timer unconditionally to give the
+ * host a chance to perform EH.
+ */
+static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
+{
+   return BLK_EH_RESET_TIMER;
+}
+
 static struct scsi_host_template virtscsi_host_template_single = {
.module = THIS_MODULE,
.name = "Virtio SCSI HBA",
@@ -806,6 +816,7 @@ static struct scsi_host_template 
virtscsi_host_template_single = {
.change_queue_depth = virtscsi_change_queue_depth,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
+   .eh_timed_out = virtscsi_eh_timed_out,
.slave_alloc = virtscsi_device_alloc,
 
.can_queue = 1024,
@@ -826,6 +837,7 @@ static struct scsi_host_template 
virtscsi_host_template_multi = {
.change_queue_depth = virtscsi_change_queue_depth,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
+   .eh_timed_out = virtscsi_eh_timed_out,
 
.can_queue = 1024,
.dma_boundary = UINT_MAX,
-- 
2.13.0

Re: [PATCH] virtio_scsi: Always try to read VPD pages

2017-04-19 Thread Paolo Bonzini



On 13/04/2017 15:39, Stefan Hajnoczi wrote:
> On Thu, Apr 13, 2017 at 12:13:00PM +1000, David Gibson wrote:
>> @@ -705,6 +706,28 @@ static int virtscsi_device_reset(struct scsi_cmnd *sc)
>>  return virtscsi_tmf(vscsi, cmd);
>>  }
>>  
>> +static int virtscsi_device_alloc(struct scsi_device *sdevice)
>> +{
>> +/*
>> + * Passed through SCSI targets (e.g. with qemu's 'scsi-block')
>> + * may have transfer limits which come from the host SCSI
>> + * controller something on the host side other than the target
> 
> s/controller something/controller or something/ ?

Acked-by: Paolo Bonzini <pbonz...@redhat.com>

with this change.

Paolo

Re: [PATCH 26/27] scsi: sd: Separate zeroout and discard command choices

2017-04-19 Thread Paolo Bonzini



On 05/04/2017 19:21, Christoph Hellwig wrote:
> +static ssize_t
> +zeroing_mode_store(struct device *dev, struct device_attribute *attr,
> +const char *buf, size_t count)
> +{
> + struct scsi_disk *sdkp = to_scsi_disk(dev);
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EACCES;
> +
> +
> + if (!strncmp(buf, zeroing_mode[SD_ZERO_WRITE], 20))
> + sdkp->zeroing_mode = SD_ZERO_WRITE;
> + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS], 20))
> + sdkp->zeroing_mode = SD_ZERO_WS;
> + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS16_UNMAP], 20))
> + sdkp->zeroing_mode = SD_ZERO_WS16_UNMAP;
> + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS10_UNMAP], 20))
> + sdkp->zeroing_mode = SD_ZERO_WS10_UNMAP;

Should this be conditional on lbprz, lbpws, lbpws10 and max_ws_blocks?

Thanks,

Paolo

Re: [PATCH 03/23] sd: implement REQ_OP_WRITE_ZEROES

2017-03-29 Thread Paolo Bonzini



On 29/03/2017 18:28, Bart Van Assche wrote:
> On Wed, 2017-03-29 at 16:51 +0200, Paolo Bonzini wrote:
>> On 28/03/2017 20:50, Bart Van Assche wrote:
>>> This means that just like the start and end of a discard must be aligned on 
>>> a
>>> discard_granularity boundary, WRITE SAME commands with the UNMAP bit set 
>>> must
>>> also respect that granularity. I think this means that either
>>> __blkdev_issue_zeroout() has to be modified such that it rejects unaligned
>>> REQ_OP_WRITE_ZEROES operations or that blk_bio_write_same_split() has to be
>>> modified such that it generates REQ_OP_WRITEs for the unaligned start and 
>>> tail.
>>
>> I don't think this is the case.
> 
> Hello Paolo,
> 
> Can you cite the section(s) from the SCSI specs that support your view? I
> reread the "5.49 WRITE SAME (10) command" and "4.7.3.4.4 WRITE SAME command
> and unmap operations" sections but I have not found any explicit statement
> that specifies the behavior for unaligned WRITE SAME commands with the UNMAP
> bit set. It seems to me like the OPTIMAL UNMAP GRANULARITY parameter was
> overlooked when both sections were written. Should we ask the T10 committee
> for a clarification?

>From 4.7.3.4.4:

--
If unmap operations are requested in a WRITE SAME command,
then for each specified LBA:

if the Data-Out Buffer of the WRITE SAME command is the same as the
logical block data returned by a read operation from that LBA while in
the unmapped state (see 4.7.4.5), then:

1) the device server performs the actions described in table 6; and

2) if an unmap operation is not performed in step 1), then the device
server shall perform the specified write operation to that LBA;
--

and from the description of WRITE SAME (10): "subsequent read operations
behave as if the device server wrote the single block of user data
received from the Data-Out Buffer to each logical block without
modification" (I have a slightly older copy though, it's 5.45 here).

It's pretty unambiguous that if the device cannot unmap (including the
case where the request is misaligned with respect to the granularity) it
does a write.

> Another question is, if the specification of WRITE SAME + UNMAP would be
> made unambiguous in the SBC document, whether or not we should take the risk
> to trigger behavior that is not what we expect by sending unaligned WRITE
> SAME + UNMAP commands to SCSI devices?

Yes, I think we should.

Paolo

Re: RFC: always use REQ_OP_WRITE_ZEROES for zeroing offload

2017-03-29 Thread Paolo Bonzini

On 23/03/2017 23:53, Lars Ellenberg wrote:
> Thin does not claim to zero data on discard.  which is ok, and correct,
> because it only punches holes on full chunks (or whatever you call
> them), and leaves the rest in the mapping tree as is.
> 
> And that behaviour would prevent DRBD from exposing discards if
> configured on top of thin. (see above)
> 
> But thin *could* easily guarantee zeroing, by simply punching holes
> where it can, and zeroing out the not fully-aligned partial start and
> end of the range.

That's the difference between REQ_OP_DISCARD (only punches holes on full
chunks) and REQ_OP_WRITE_ZEROES with the REQ_UNMAP flag (punches holes +
zeroes incomplete chunks).

dm-thinp's REQ_OP_DISCARD should not do anything for unaligned parts.
Instead, layers above should use REQ_OP_WRITE_ZEROES (with or without
REQ_UNMAP, as required) if they need zeroes.  dm-thinp would have to
split off the partial chunks, and zero them in the lower-level device
with REQ_OP_WRITE_ZEROES.

Paolo

Re: [PATCH 12/23] sd: handle REQ_UNMAP

2017-03-29 Thread Paolo Bonzini

On 28/03/2017 18:48, Bart Van Assche wrote:
>> +if (rq->cmd_flags & REQ_UNMAP) {
>> +switch (sdkp->provisioning_mode) {
>> +case SD_LBP_WS16:
>> +return sd_setup_write_same16_cmnd(cmd, true);
>> +case SD_LBP_WS10:
>> +return sd_setup_write_same10_cmnd(cmd, true);
>> +}
>> +}
>> +
>>  if (sdp->no_write_same)
>>  return BLKPREP_INVALID;
>>  if (sdkp->ws16 || sector > 0x || nr_sectors > 0x)
> Users can change the provisioning mode from user space from SD_LBP_WS16 into
> SD_LBP_WS10 so I'm not sure it's safe to skip the (sdkp->ws16 || sector >
> 0x || nr_sectors > 0x) check if REQ_UNMAP is set.

Yeah, if REQ_UNMAP is set you should probably check sdkp->provisioning_mode
instead of sdkp->ws16, but apart from this it should still go through the
checks below.

Plus, if the provisioning mode is not ws10 or ws16, should
sd_setup_write_zeroes_cmnd:

1) do a WRITE SAME without UNMAP (what Christoph's code does)

2) return BLKPREP_INVALID

3) ignore provisioning mode and do a WRITE SAME with UNMAP

4) do a WRITE SAME without UNMAP for SD_LBP_{ZERO,FULL,DISABLE},
do a WRITE SAME with UNMAP for SD_LBP_{WS10,WS16,UNMAP}.

I'm in favor of (4).  The distinction between SD_LBP_UNMAP, SD_LBP_WS10
and SD_LBP_WS16 is as problematic as discard_zeroes_data in my opinion.

Thanks,

Paolo

Re: [PATCH 23/23] block: remove the discard_zeroes_data flag

2017-03-29 Thread Paolo Bonzini



On 28/03/2017 19:00, Bart Van Assche wrote:
> On Thu, 2017-03-23 at 10:33 -0400, Christoph Hellwig wrote:
>> Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
>> kill this hack.
>>
>> [ ... ]
>>
>> diff --git a/Documentation/ABI/testing/sysfs-block 
>> b/Documentation/ABI/testing/sysfs-block
>> index 2da04ce6aeef..dea212db9df3 100644
>> --- a/Documentation/ABI/testing/sysfs-block
>> +++ b/Documentation/ABI/testing/sysfs-block
>> @@ -213,14 +213,8 @@ What:   
>> /sys/block//queue/discard_zeroes_data
>>  Date:   May 2011
>>  Contact:Martin K. Petersen 
>>  Description:
>> -Devices that support discard functionality may return
>> -stale or random data when a previously discarded block
>> -is read back. This can cause problems if the filesystem
>> -expects discarded blocks to be explicitly cleared. If a
>> -device reports that it deterministically returns zeroes
>> -when a discarded area is read the discard_zeroes_data
>> -parameter will be set to one. Otherwise it will be 0 and
>> -the result of reading a discarded area is undefined.
>> +Will always return 0.  Don't rely on any specific behavior
>> +for discards, and don't read this file.
>>  
>>  What:   /sys/block//queue/write_same_max_bytes
>>  Date:   January 2012
>>
>> [ ... ]
>>
>> --- a/block/blk-sysfs.c
>> +++ b/block/blk-sysfs.c
>> @@ -208,7 +208,7 @@ static ssize_t queue_discard_max_store(struct 
>> request_queue *q,
>>  
>>  static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char 
>> *page)
>>  {
>> -return queue_var_show(queue_discard_zeroes_data(q), page);
>> +return 0;
>>  }
> 
> Hello Christoph,
> 
> It seems to me like the documentation in Documentation/ABI/testing/sysfs-block
> and the above code are not in sync. I think the above code will cause reading
> from the discard_zeroes_data attribute to return an empty string ("") instead
> of "0\n".
> 
> BTW, my personal preference is to remove this attribute entirely because 
> keeping
> it will cause confusion, no matter how well we document the behavior of this
> attribute.

If you remove it, you should probably remove the BLKDISCARDZEROES ioctl too.

That said, the issue with discard_zeroes_data is that it is badly
defined; it was defined as "if I unmap X, will it read as zeroes?" but
this is not how the SCSI standard defines e.g. the UNMAP command with
LBPRZ=1.  But knowing something like LBPRZ ("if something is unmapped,
will it read as zeroes?") _would_ actually be useful for userspace.
This will be especially true once sd maps lseek(SEEK_HOLE/SEEK_DATA) to
the SCSI GET LBA STATUS command, or once dm-thin supports them.

Secondarily, if the former returns 1, userspace is also interested in
knowing "can REQ_OP_WRITE_ZEROES+REQ_UNMAP ever unmap anything?", i.e.
whether BLKDEV_ZERO_NOFALLBACK will ever return anything but
-EOPNOTSUPP.  For SCSI, this should intuitively mean whether LBPWS or
LBPWS10 are set, but the details depend on how the sd driver implements
REQ_OP_WRITE_ZEROES with REQ_UNMAP.

Paolo

Re: [PATCH 03/23] sd: implement REQ_OP_WRITE_ZEROES

2017-03-29 Thread Paolo Bonzini

On 28/03/2017 20:50, Bart Van Assche wrote:
> 
> This means that just like the start and end of a discard must be aligned on a
> discard_granularity boundary, WRITE SAME commands with the UNMAP bit set must
> also respect that granularity. I think this means that either
> __blkdev_issue_zeroout() has to be modified such that it rejects unaligned
> REQ_OP_WRITE_ZEROES operations or that blk_bio_write_same_split() has to be
> modified such that it generates REQ_OP_WRITEs for the unaligned start and 
> tail.

I don't think this is the case.

Rather, Linux should try to align the WRITE SAME commands to the optimal
unmap granularity if the zeroed area requires performing more than one
WRITE SAME command (i.e. > maximum write same length or too large to fit
in the CDB).  However, even in that case it can use WRITE SAME with
UNMAP for the unaligned start and tail.  Unlike the UNMAP command, the
SCSI standard does guarantee that zeroes are written in the unaligned parts.

Paolo

Re: [PATCH 2/2] virtio_scsi: Implement fc_host

2017-01-17 Thread Paolo Bonzini

On 16/01/2017 18:26, Fam Zheng wrote:
>> Is the endianness correct for big-endian host here?
>
> I think so. The fc_host sysfs uses u64 to represent port_name and node_name,
> this patch does the same, so using virtio_* helpers for these fields should
> handle the endianness correctly.

I was suspicious about it because they are defined as "u8 x[8]" in the
virtio_scsi_config struct.  So you would need to read with
virtio_cread_bytes and pass the result to wwn_to_u64.

For example, if you have 0x500123456789abcd this would be

0x50 0x01 0x23 0x45 0x67 0x89 0xab 0cd

in virtio_scsi_config, and then virtio_cread64 would read it as a
little-endian u64, 0xcdab896745230150.  Maybe your QEMU patch is also
writing things as little-endian 64-bit integers, rather than 8-element
arrays of bytes?

Paolo

> Maybe we should use u64 in struct virtio_scsi_config as well?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/2] virtio-scsi: Implement FC_HOST feature

2017-01-16 Thread Paolo Bonzini


> How it this supposed to work?
> You do export the WWPN/WWNN of the associated host to the guest (nb:
> will get interesting for non NPIV setups ...), but virtio scsi will
> still do a LUN remapping.
> IE the LUNs you see on the host will be different from the LUNs
> presented to the guest.

This is taken care of in the host by presenting to the host all LUNs from
a host's NPIV vHBA.  (Libvirt probably would be the one taking care of this,
because QEMU may not have enough permissions).

> Plus you don't _actually_ expose the FC host, but rather the WWPN of the
> host presenting the LUN.
> So how do you handle LUNs from different FC hosts on the guest?

I'm not sure I understand.

Neither I nor Fam know this stuff very well, but we are trying to do the same
as Hyper-V (and other proprietary hypervisors too).

> Overall, I'm not overly happy with this approach.
> You already added WWPN ids to the virtio transport, so why didn't you
> update the LUN field, too, to avoid this ominous LUN remapping?

Is this your old idea of adding a separate target field to commands,
in order to support 64-bit LUNs?  That is separate, and most FC drivers
only default to 16-bit LUNs anyway.

> And we really should make sure to have a single FC host in the guest
> presenting all LUNs.

Yes, of course.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] virtio_scsi: Implement fc_host

2017-01-16 Thread Paolo Bonzini



On 16/01/2017 17:04, Fam Zheng wrote:
> + node_name = virtio_cread64(vdev,
> + offsetof(struct virtio_scsi_config, primary_wwnn));
> + port_name = virtio_cread64(vdev,
> + offsetof(struct virtio_scsi_config, primary_wwpn));
> + } else {
> + node_name = virtio_cread64(vdev,
> + offsetof(struct virtio_scsi_config, secondary_wwnn));
> + port_name = virtio_cread64(vdev,
> + offsetof(struct virtio_scsi_config, secondary_wwpn));

Is the endianness correct for big-endian host here?

Thanks,

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] ibmvscsi: add write memory barrier to CRQ processing

2016-12-08 Thread Paolo Bonzini



On 08/12/2016 00:31, Tyrel Datwyler wrote:
> The first byte of each CRQ entry is used to indicate whether an entry is
> a valid response or free for the VIOS to use. After processing a
> response the driver sets the valid byte to zero to indicate the entry is
> now free to be reused. Add a memory barrier after this write to ensure
> no other stores are reordered when updating the valid byte.
> 
> Signed-off-by: Tyrel Datwyler 
> ---
>  drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c 
> b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index d9534ee..2f5b07e 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -232,6 +232,7 @@ static void ibmvscsi_task(void *data)
>   while ((crq = crq_queue_next_crq(>queue)) != NULL) {
>   ibmvscsi_handle_crq(crq, hostdata);
>   crq->valid = VIOSRP_CRQ_FREE;
> + wmb();
>   }
>  
>   vio_enable_interrupts(vdev);
> @@ -240,6 +241,7 @@ static void ibmvscsi_task(void *data)
>   vio_disable_interrupts(vdev);
>   ibmvscsi_handle_crq(crq, hostdata);
>   crq->valid = VIOSRP_CRQ_FREE;
> + wmb();

Should this driver use virt_wmb instead?

Paolo

>   } else {
>   done = 1;
>   }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/10] mm: remove write/force parameters from __get_user_pages_unlocked()

2016-10-13 Thread Paolo Bonzini

= 0;
>   unsigned long max_pages_per_loop = PVM_MAX_KMALLOC_PAGES
>   / sizeof(struct pages *);
> + unsigned int flags = FOLL_REMOTE;
>  
>   /* Work out address and page range required */
>   if (len == 0)
>   return 0;
>   nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1;
>  
> + if (vm_write)
> + flags |= FOLL_WRITE;
> +
>   while (!rc && nr_pages && iov_iter_count(iter)) {
>   int pages = min(nr_pages, max_pages_per_loop);
>   size_t bytes;
> @@ -104,8 +108,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
>* current/current->mm
>*/
>   pages = __get_user_pages_unlocked(task, mm, pa, pages,
> -   vm_write, 0, process_pages,
> -   FOLL_REMOTE);
> +   process_pages, flags);
>   if (pages <= 0)
>   return -EFAULT;
>  
> diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
> index db96688..8035cc1 100644
> --- a/virt/kvm/async_pf.c
> +++ b/virt/kvm/async_pf.c
> @@ -84,7 +84,8 @@ static void async_pf_execute(struct work_struct *work)
>* mm and might be done in another context, so we must
>* use FOLL_REMOTE.
>*/
> - __get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL, FOLL_REMOTE);
> + __get_user_pages_unlocked(NULL, mm, addr, 1, NULL,
> + FOLL_WRITE | FOLL_REMOTE);
>  
>   kvm_async_page_present_sync(vcpu, apf);
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 81dfc73..28510e7 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1416,10 +1416,15 @@ static int hva_to_pfn_slow(unsigned long addr, bool 
> *async, bool write_fault,
>   down_read(>mm->mmap_sem);
>   npages = get_user_page_nowait(addr, write_fault, page);
>   up_read(>mm->mmap_sem);
> - } else
> + } else {
> +     unsigned int flags = FOLL_TOUCH | FOLL_HWPOISON;
> +
> + if (write_fault)
> + flags |= FOLL_WRITE;
> +
>   npages = __get_user_pages_unlocked(current, current->mm, addr, 
> 1,
> -write_fault, 0, page,
> -FOLL_TOUCH|FOLL_HWPOISON);
> +page, flags);
> + }
>   if (npages != 1)
>   return npages;
>  
> 

Acked-by: Paolo Bonzini <pbonz...@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Remove the unnecessary semicolons

2016-09-15 Thread Paolo Bonzini



On 19/08/2016 04:03, Zhou Jie wrote:
> At the end of funcions, semicolons are unnecessary. So drop them.
> 
> Signed-off-by: Chao Fan <fanc.f...@cn.fujitsu.com>
> ---
>  drivers/scsi/virtio_scsi.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 7dbbb29..9632a0c 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -241,7 +241,7 @@ static void virtscsi_req_done(struct virtqueue *vq)
>   struct virtio_scsi_vq *req_vq = >req_vqs[index];
>  
>   virtscsi_vq_done(vscsi, req_vq, virtscsi_complete_cmd);
> -};
> +}
>  
>  static void virtscsi_poll_requests(struct virtio_scsi *vscsi)
>  {
> @@ -267,7 +267,7 @@ static void virtscsi_ctrl_done(struct virtqueue *vq)
>   struct virtio_scsi *vscsi = shost_priv(sh);
>  
>   virtscsi_vq_done(vscsi, >ctrl_vq, virtscsi_complete_free);
> -};
> +}
>  
>  static void virtscsi_handle_event(struct work_struct *work);
>  
> @@ -413,7 +413,7 @@ static void virtscsi_event_done(struct virtqueue *vq)
>   struct virtio_scsi *vscsi = shost_priv(sh);
>  
>   virtscsi_vq_done(vscsi, >event_vq, virtscsi_complete_event);
> -};
> +}
>  
>  /**
>   * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue
> 

Acked-by: Paolo Bonzini <pbonz...@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist

2016-06-06 Thread Paolo Bonzini



On 06/06/2016 17:47, John Snow wrote:
> > > Various downstreams may have backported the VPD fix to older versions,
> > > we need to be careful not to block those, too ... so targeting the core
> > > behavior seems like the more strictly correct, easily maintainable 
> > > solution.
> > 
> > I think this is not practical.  I'm okay with the big hammer if an
> > algorithmic fix is not feasible; but otherwise it does seem a better
> > idea than blacklisting based on inquiry data...
> 
> You think the more practical solution is a SCSI driver that can hang
> because of an incorrect/missing response and to maintain a carefully
> curated blacklist to work around this behavior?

The best solution would be an algorithmic fix, perhaps predicated by
some kind of quirk bit.

A carefully curated blacklist is impossible because you cannot account
for a zillion downstreams, most of which probably don't change the
inquire vendor/product data; version numbers are awful.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist

2016-06-06 Thread Paolo Bonzini



On 06/06/2016 17:41, John Snow wrote:
> On 06/06/2016 11:05 AM, Paolo Bonzini wrote:
>> For ATAPI, you have to blacklist all versions up to 2.2 inclusive.
>>
>> This gives:
>>
>> - QEMU / QEMU CD-ROM / 0.8.(this is IDE and SCSI)
>> - QEMU / QEMU CD-ROM / 0.9.(this is IDE and SCSI)
>> - QEMU / QEMU CD-ROM / 0.10(this is SCSI only)
>> - QEMU / QEMU CD-ROM / 0.11(this is SCSI only)
>> - QEMU / QEMU DVD-ROM / 0.8.   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.9.   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.10   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.11   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.12   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.13   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.14   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 0.15   (this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.0(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.1(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.2(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.3(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.4(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.5(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.6(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 1.7(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 2.0(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 2.1(this is IDE only)
>> - QEMU / QEMU DVD-ROM / 2.2(this is IDE only)
>>
> 
> If this bug is caused by a missing VPD response, Paolo's version history
> here is correct for upstream versions.
> 
> Various downstreams may have backported the VPD fix to older versions,
> we need to be careful not to block those, too ... so targeting the core
> behavior seems like the more strictly correct, easily maintainable solution.

I think this is not practical.  I'm okay with the big hammer if an
algorithmic fix is not feasible; but otherwise it does seem a better
idea than blacklisting based on inquiry data...

Thanks,

Paolo

> Why not just dynamically blacklist devices that fail to respond to VPD
> inquiries?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist

2016-06-06 Thread Paolo Bonzini



On 06/06/2016 16:22, Hannes Reinecke wrote:
> So either we dig into what went wrong with qemu 0.8, or we figure out
> from which qemu version things start to behave nicely, and blacklist
> earlier versions.
> 
> > > Either way, this patch is wrong.
> > 
> > If we can identify which versions work, we can update it.  Otherwise
> > I think we have to be conservative.
> 
> So far we just had this single report where the upstream kernel didn't
> work correctly with a (really old) version of qemu.
> Hardly justifying blacklisting qemu CD-ROM in general.

To further complicate the matter there are two QEMU MMC devices:

1) ATAPI
   - vendor "QEMU" / product name "QEMU CD-ROM" before QEMU 0.10.0
   - vendor "QEMU" / product name "QEMU DVD-ROM" since QEMU 0.10.0

2) native SCSI
   - vendor "QEMU" / product name "QEMU CD-ROM"

VPD in the SCSI CD-ROM probably has always worked, but I would blacklist
up to 0.11 inclusive just to be safe.  Those versions are dead anyway.

VPD in the ATAPI CD-ROM is newer, and that's where the bug was reported on:

> [4.439488] ata2.00: ATAPI: QEMU CD-ROM, 0.8.2, max UDMA/100
> [4.443649] ata2.00: configured for MWDMA2
> [4.450267] scsi 1:0:0:0: CD-ROMQEMU QEMU CD-ROM  0.8. 
> PQ: 0 ANSI: 5
> [4.464317] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 
> frozen
> [4.464319] ata2.00: BMDMA stat 0x5
> [4.464339] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma 
> 16640 in
> [4.464339]  Inquiry 12 01 00 00 ff 00res 
> 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
> [4.464341] ata2.00: status: { DRDY DRQ }

For ATAPI, you have to blacklist all versions up to 2.2 inclusive.

This gives:

- QEMU / QEMU CD-ROM / 0.8.(this is IDE and SCSI)
- QEMU / QEMU CD-ROM / 0.9.(this is IDE and SCSI)
- QEMU / QEMU CD-ROM / 0.10(this is SCSI only)
- QEMU / QEMU CD-ROM / 0.11(this is SCSI only)
- QEMU / QEMU DVD-ROM / 0.8.   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.9.   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.10   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.11   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.12   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.13   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.14   (this is IDE only)
- QEMU / QEMU DVD-ROM / 0.15   (this is IDE only)
- QEMU / QEMU DVD-ROM / 1.0(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.1(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.2(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.3(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.4(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.5(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.6(this is IDE only)
- QEMU / QEMU DVD-ROM / 1.7(this is IDE only)
- QEMU / QEMU DVD-ROM / 2.0(this is IDE only)
- QEMU / QEMU DVD-ROM / 2.1(this is IDE only)
- QEMU / QEMU DVD-ROM / 2.2(this is IDE only)

Thanks,

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Application error handling with write-back caching

2016-05-10 Thread Paolo Bonzini

On 10/05/2016 19:31, James Bottomley wrote:
> > What about a SPACE ALLOCATION FAILED error or a similar error that 
> > can be fixed by administrator actions (or just by a concurrent 
> > process doing an UNMAP)?  Would a subsequent cache flush cause data
> > loss?
>
> You're now asking about how these are handled?  It's not a SCSI
> problem.  I believe if you look at the various layers, RAID would still
> treat it as fatal and fail the drive and so would most filesystems. 
> The AEN warnings for TP are reported, but the admin has to sort it out
> before they become a fatal error.

Thanks, fatal errors are fine I guess.  We were worried that the next
SYNCHRONIZE CACHE would succeed and throw away the writes because it has
already "performed a write medium operation".

POSIX fsync is pretty underspecified in this respect too; gluster has
been throwing away those writes for a long time!  It stopped now because
we explained the issue to them, but it's pointless if the next layer
below does the same---hence Stefan's question.

(In our case the next layer is not the page cache, because we generally
use O_DIRECT.  Evicting dirty pages from the page cache would be okay if
the process(es) that wrote them are SIGKILLed, but in general it would
be a problem too).

Thanks,

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Application error handling with write-back caching

2016-05-10 Thread Paolo Bonzini



On 10/05/2016 16:16, James Bottomley wrote:
> > If "is performed" just means "completes", maybe with an error, the
> > application would have to resubmit write requests and then try to 
> > flush the write cache again.
> > 
> > I'm not aware of applications that keep acknowledged write data 
> > around until the cache flush completion in order to retry writes.
> 
> I think you may be misunderstanding the nature of the returned error. 
> It will be permanent and fatal and usually signal that the device has
> a failed sector that can't be remapped and so the device itself has for
> most purposes failed.  The only recovery is if you happen to have RAID,
> in which case the RAID layer will mostly take care of it.

What about a SPACE ALLOCATION FAILED error or a similar error that can
be fixed by administrator actions (or just by a concurrent process doing
an UNMAP)?  Would a subsequent cache flush cause data loss?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 0/4] scsi: cleanup ioctl headers and provide UAPI versions

2015-12-28 Thread Paolo Bonzini



On 25/09/2015 11:27, Paolo Bonzini wrote:
> This is v3 of the series to provide an "official" sg.h header (and
> scsi_ioctl.h too, though it's basically obsolete) together with the other
> userspace API definitions.  The change from v2 to v3 is that defaults
> for sg.c are not exported in include/uapi/linux/sg.c.
> 
> Paolo
> 
> 2.5.0
> 

What happened to these patches?...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-11-02 Thread Paolo Bonzini

On 02/11/2015 16:05, Mike Snitzer wrote:
> > In any case, if we don't start path activation we should return
> > ENOTCONN, not ENOTTY.
> 
> Currently, if we don't start path activation we're returning EIO.
> ENOTCONN is used for when we do start path activation (and ENOTCONN is
> the means for DM core to retry)
> 
> We _could_ change the ENOTCONN to be EAGAIN and EIO to ENOTCONN...

This makes sense... though of course testing the impact of this on
userspace is going to be hard. :(  Chances are that userspace is not
expecting EAGAIN either.

Even if they did, how would someone know that they can now retry the
ioctl after getting EAGAIN?  Should they just do it in a loop?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-11-02 Thread Paolo Bonzini



On 02/11/2015 14:56, Hannes Reinecke wrote:
> But then the real question remains:
> 
> What is the 'correct' behaviour for ioctls when no path retry
> is active (or when no paths are present)?
> 
> Should we start path activation?
> If so, should we wait for path activation to finish, risking udev
> killing the worker for that event (udev has a built-in timeout of
> 120 seconds, which we might easily exceed when we need to activate
> paths for large installations or slow path activation ... just
> thinking of NetApp takeover/giveback cycle)?
> If we're not waiting for path activation, where would be the point
> in starting it, seeing that we're not actually interested in the result?
> And if we shouldn't start a path activation, what is the point of
> having code for it in the first place?

That's a fair question, and it depends on what said udev worker actually
does.

In any case, if we don't start path activation we should return
ENOTCONN, not ENOTTY.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-11-02 Thread Paolo Bonzini



On 02/11/2015 08:28, Hannes Reinecke wrote:
> 
> With the original code we would need to wait for path activation
> before we would be able to figure out if the ioctl is valid.
> However, the callback to verify the ioctl is sd_ioctl(), which as
> a first step calls scsi_verify_ioctl().
> So my reasoning was that we can as well call scsi_verify_ioctl()
> early, and allow it to filter out known invalid ioctls.
> Then we wouldn't need to wait for path activation and return
> immediately.

That in principle makes sense.  Unfortunately, before path activation
you can only find out if a ioctl is valid.  You cannot find out which
ioctls are _in_valid, because path activation sets the bdev and that
might make all ioctls valid.

> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
> for path activation, but then just bail out with -ENOTCONN.
> As we're not resetting -ENOTCONN, where's the point in activate the
> paths here?
> Shouldn't we retry to figure out if -ENOTCONN is still true?

That makes sense too.  You've done the work, might as well reap the
benefits...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-11-02 Thread Paolo Bonzini

On 31/10/2015 23:47, Mike Snitzer wrote:
> Yes, with your commit ec8013be ("dm: do not forward ioctls from logical
> volumes to the underlying device") you added protections to disallow
> issuing ioctls to a partition that could impact the rest of the device.
> 
> Given that I can see why you're seizing on the "ti->len !=
> i_size_read(bdev->bd_inode) >> SECTOR_SHIFT" negative checks that gate
> the call to scsi_verify_blk_ioctl().

Right.

> For Hannes, and in my head, it didn't matter if a future bdev satisfies
> the length condition.

I agree actually.  The only problem is that the returned errno value is
ENOTTY, and to userspace that "sounds like" a future bdev will not make
the ioctl valid.

> I could've sworn that unprivledged users (without CAP_SYS_RAWIO)
> wouldn't be allowed to issue ioctls.  Am I completely mistaken?

They are allowed to issue ioctls.

CAP_SYS_RAWIO changes that to also allow issuing of ioctls to
partitions.  That was required by Linus for backwards compatibility.

> Or is
> it still contentious and DM-mpath removing the ability to allow these
> unprivledged ioctls (as a side-effect of Hannes' commit ec8013be) makes
> your life, and other virt users' lives, harder?

Yes, it would.  virt runs as an unprivileged user (so does CD burning,
which was the original reason to let SG_IO run by unprivileged users;
there are probably other use cases).

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-10-31 Thread Paolo Bonzini

On 31/10/2015 19:13, Mike Snitzer wrote:
> > But that's wrong, I think.  It's a false positive in
> > scsi_verify_blk_ioctl().
> > 
> > If the ioctl is valid when bdev becomes non-NULL (and it will be if
> > ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
> > you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
> > think the ioctls can go away and come back.  So Hannes's patch broke the
> > userspace ABI. :(
> 
> Huh?  All that Hannes' patch did was add early verification of the ioctl
> if there are no paths, since: there is no point queueing an ioctl that
> is invalid.
> 
> [snip discussion of Christoph's patches]
> 
> The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid.
> It has nothing to do with the existance of a bdev or not; but everything
> to do with the unprivledged user's request to issue an ioctl.

... but the call is skipped (and all ioctls are valid) if ti->len ==
i_size_read(bdev->bd_inode) >> SECTOR_SHIFT.  Therefore, until you have
the bdev you don't know which ioctls are valid, and you must assume all
of them are.  You can't do anything unsafe anyway until you have the
bdev.  This is the reasoning prior to Hannes's change.

Afterwards, you end up calling scsi_verify_blk_ioctl() when bdev ==
NULL.  If the future bdev satisfies the above condition on ti->len, this
means that ioctl(SG_IO) switches from ENOTTY to available.  Userspace is
clearly not expecting that.

> Paolo, AFAIK unprivledged ioctls is one of your pet-projects so your
> insight on what, if anything, needs changing to support them is the
> insight I think we need.

I hope the above provides some extra information.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-10-31 Thread Paolo Bonzini

On 29/10/2015 14:18, Mike Snitzer wrote:
> > 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
> >it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
> > 
> > $ dmesg
> > <...>
> > [] device-mapper: multipath: Failing path 65:144.
> > [] device-mapper: multipath: Failing path 67:144.
> > [] device-mapper: multipath: Failing path 65:224.
> > [] device-mapper: multipath: Failing path 68:32.
> > [] sgio_inquiry: sending ioctl 2285 to a partition!
> 
> So scsi_verify_blk_ioctl() considers the ioctl invalid.

But that's wrong, I think.  It's a false positive in
scsi_verify_blk_ioctl().

If the ioctl is valid when bdev becomes non-NULL (and it will be if
ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
think the ioctls can go away and come back.  So Hannes's patch broke the
userspace ABI. :(

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-25 Thread Paolo Bonzini



On 17/09/2015 17:32, Bart Van Assche wrote:
> 
>> +
>> +#ifndef __KERNEL__
>> +/* Keep in sync with SG_DEFAULT_TIMEOUT of scsi/sg.h  */
>>   #define SG_DEFAULT_TIMEOUT(60*HZ) /* HZ == 'jiffies in 1
>> second' */
>>   #endif
> 
> Is it useful and/or necessary to export this constant ? To me this looks
> like an implementation aspect rather than an aspect of the scsi-sg API.

That's fine, I can remove it.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-25 Thread Paolo Bonzini

Provide a UAPI version of the header in the kernel, making it easier
for interested projects to use an up-to-date version of the header.

The new headers are placed under uapi/linux/ so as not to conflict
with the glibc-provided headers in /usr/include/scsi.

/dev/sgN default values are implementation aspects, and are moved to
drivers/scsi/sg.c instead (together with e.g. SG_ALLOW_DIO_DEF).
However, SG_SCATTER_SZ is used by Wine so it is kept in linux/sg.h
SG_MAX_QUEUE could also be useful.

struct scsi_ioctl_command and struct scsi_idlun used to be under
"#ifdef __KERNEL__", but they are actually useful for userspace as
well.  Add them to the new header.

Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: linux-scsi@vger.kernel.org
Cc: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 drivers/scsi/sg.c |   8 +-
 include/scsi/scsi_ioctl.h |  50 +-
 include/scsi/sg.h | 261 +-
 include/uapi/linux/Kbuild |   2 +
 include/{scsi => uapi/linux}/scsi_ioctl.h |  23 +--
 include/{scsi => uapi/linux}/sg.h |  40 +
 6 files changed, 25 insertions(+), 359 deletions(-)
 copy include/{scsi => uapi/linux}/scsi_ioctl.h (77%)
 copy include/{scsi => uapi/linux}/sg.h (91%)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9d7b7db75e4b..3410e50a37d0 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -90,7 +90,14 @@ static void sg_proc_cleanup(void);
  */
 #define MULDIV(X,MUL,DIV) X % DIV) * MUL) / DIV) + ((X / DIV) * MUL))
 
+#define SG_DEFAULT_TIMEOUT_USER(60*USER_HZ) /* HZ == 'jiffies in 1 
second' */
 #define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
+#define SG_DEFAULT_RETRIES 0
+#define SG_DEF_FORCE_LOW_DMA 0
+#define SG_DEF_FORCE_PACK_ID 0
+#define SG_DEF_KEEP_ORPHAN 0
+#define SG_DEF_COMMAND_Q 0
+#define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index 2bd9d67c201a..d54f9db2e079 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -1,60 +1,12 @@
 #ifndef _SCSI_IOCTL_H
 #define _SCSI_IOCTL_H 
 
-#define SCSI_IOCTL_SEND_COMMAND 1
-#define SCSI_IOCTL_TEST_UNIT_READY 2
-#define SCSI_IOCTL_BENCHMARK_COMMAND 3
-#define SCSI_IOCTL_SYNC 4  /* Request synchronous 
parameters */
-#define SCSI_IOCTL_START_UNIT 5
-#define SCSI_IOCTL_STOP_UNIT 6
-/* The door lock/unlock constants are compatible with Sun constants for
-   the cdrom */
-#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
-#define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
-
-/*
- * Here are some obsolete SCSI-specific ioctl commands.
- *
- * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
- */
-
-/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
-#define SCSI_IOCTL_GET_IDLUN   0x5382
-
-/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
-
-/* Used to obtain the host number of a device. */
-#define SCSI_IOCTL_PROBE_HOST  0x5385
-
-/* Used to obtain the bus number for a device */
-#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
-
-/* Used to obtain the PCI location of a device */
-#define SCSI_IOCTL_GET_PCI 0x5387
-
-#ifdef __KERNEL__
+#include 
 
 struct scsi_device;
 
-/*
- * Structures used for scsi_ioctl et al.
- */
-
-typedef struct scsi_ioctl_command {
-   unsigned int inlen;
-   unsigned int outlen;
-   unsigned char data[0];
-} Scsi_Ioctl_Command;
-
-typedef struct scsi_idlun {
-   __u32 dev_id;
-   __u32 host_unique_id;
-} Scsi_Idlun;
-
-
 int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev,
int cmd, bool ndelay);
 extern int scsi_ioctl(struct scsi_device *, int, void __user *);
 
-#endif /* __KERNEL__ */
 #endif /* _SCSI_IOCTL_H */
diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index 370c78c37926..f9d3d1dace41 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -2,267 +2,8 @@
 #define _SCSI_GENERIC_H
 
 #include 
+#include 
 
-/*
- * History:
- *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
- *  Development Sponsored by Killy Corp. NY NY
- *
- * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
- * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
- *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
- *
- * Documentation
- * =
- * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
- * The documentation for the sg

[PATCH v3 2/4] scsi: cleanup scsi/scsi_ioctl.h

2015-09-25 Thread Paolo Bonzini

SCSI_REMOVAL_* goes together with other SCSI command constants in
include/scsi/scsi.h.  It is also used outside the implementation
of the ioctls (and it is not part of the user API).

scsi_fctargaddress/Scsi_FCTargAddress has had no in-tree use since
commit ca61f10ab2b8 ("[SCSI] remove broken driver cpqfc", 2005-10-29).
Remove it, just in time for the the tenth anniversary of its demise.

Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/scsi.h   | 6 ++
 include/scsi/scsi_ioctl.h | 8 
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index e0a3398b1547..5e2bafdbd96f 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -279,6 +279,12 @@ static inline int scsi_is_wlun(u64 lun)
 #define SCSI_INQ_PQ_NOT_CON 0x01
 #define SCSI_INQ_PQ_NOT_CAP 0x03
 
+/*
+ * PREVENT/ALLOW MEDIUM REMOVAL
+ */
+#defineSCSI_REMOVAL_PREVENT1
+#defineSCSI_REMOVAL_ALLOW  0
+
 
 /*
  * Here are some scsi specific ioctl commands which are sometimes useful.
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index 8d19d1d233c3..c81962bef7a0 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -12,9 +12,6 @@
 #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
 #define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
 
-#defineSCSI_REMOVAL_PREVENT1
-#defineSCSI_REMOVAL_ALLOW  0
-
 #ifdef __KERNEL__
 
 struct scsi_device;
@@ -34,11 +31,6 @@ typedef struct scsi_idlun {
__u32 host_unique_id;
 } Scsi_Idlun;
 
-/* Fibre Channel WWN, port_id struct */
-typedef struct scsi_fctargaddress {
-   __u32 host_port_id;
-   unsigned char host_wwn[8]; // include NULL term.
-} Scsi_FCTargAddress;
 
 int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev,
int cmd, bool ndelay);
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 0/4] scsi: cleanup ioctl headers and provide UAPI versions

2015-09-25 Thread Paolo Bonzini

This is v3 of the series to provide an "official" sg.h header (and
scsi_ioctl.h too, though it's basically obsolete) together with the other
userspace API definitions.  The change from v2 to v3 is that defaults
for sg.c are not exported in include/uapi/linux/sg.c.

Paolo

2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 1/4] scsi: remove old-style type names from sg.h

2015-09-25 Thread Paolo Bonzini

These will not be exported by the new linux/sg.h header, and scsi/sg.h will
not have any user API after linux/sg.h is created.  Since they have no
user in the kernel, they can be zapped.

Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/sg.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index 3afec7032448..370c78c37926 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -207,12 +207,6 @@ typedef struct sg_req_info { /* used by 
SG_GET_REQUEST_TABLE ioctl() */
 
 #define SG_BIG_BUFF SG_DEF_RESERVED_SIZE/* for backward compatibility */
 
-/* Alternate style type names, "..._t" variants preferred */
-typedef struct sg_io_hdr Sg_io_hdr;
-typedef struct sg_io_vec Sg_io_vec;
-typedef struct sg_scsi_id Sg_scsi_id;
-typedef struct sg_req_info Sg_req_info;
-
 
 /* vv */
 /*   The older SG interface based on the 'sg_header' structure follows.   */
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 3/4] scsi: move all obsolete ioctls to scsi_ioctl.h

2015-09-25 Thread Paolo Bonzini

Some are in scsi.h.  Keep them together in preparation for exposing them
in UAPI headers.

Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/scsi.h   | 20 
 include/scsi/scsi_ioctl.h | 20 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5e2bafdbd96f..a96df31af89e 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -286,26 +286,6 @@ static inline int scsi_is_wlun(u64 lun)
 #defineSCSI_REMOVAL_ALLOW  0
 
 
-/*
- * Here are some scsi specific ioctl commands which are sometimes useful.
- *
- * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
- */
-
-/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
-#define SCSI_IOCTL_GET_IDLUN   0x5382
-
-/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
-
-/* Used to obtain the host number of a device. */
-#define SCSI_IOCTL_PROBE_HOST  0x5385
-
-/* Used to obtain the bus number for a device */
-#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
-
-/* Used to obtain the PCI location of a device */
-#define SCSI_IOCTL_GET_PCI 0x5387
-
 /* Pull a u32 out of a SCSI message (using BE SCSI conventions) */
 static inline __u32 scsi_to_u32(__u8 *ptr)
 {
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index c81962bef7a0..2bd9d67c201a 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -12,6 +12,26 @@
 #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
 #define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
 
+/*
+ * Here are some obsolete SCSI-specific ioctl commands.
+ *
+ * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
+ */
+
+/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
+#define SCSI_IOCTL_GET_IDLUN   0x5382
+
+/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
+
+/* Used to obtain the host number of a device. */
+#define SCSI_IOCTL_PROBE_HOST  0x5385
+
+/* Used to obtain the bus number for a device */
+#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
+
+/* Used to obtain the PCI location of a device */
+#define SCSI_IOCTL_GET_PCI 0x5387
+
 #ifdef __KERNEL__
 
 struct scsi_device;
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-17 Thread Paolo Bonzini

Provide a UAPI version of the header in the kernel, making it easier
for interested projects to use an up-to-date version of the header.

The new headers are placed under uapi/linux/ so as not to conflict
with the glibc-provided headers in /usr/include/scsi.

struct scsi_ioctl_command and struct scsi_idlun used to be under
"#ifdef __KERNEL__", but they are actually useful for userspace as
well.  Add them to the new header.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/scsi_ioctl.h |  50 +-
 include/scsi/sg.h | 260 +-
 include/uapi/linux/Kbuild |   2 +
 include/{scsi => uapi/linux}/scsi_ioctl.h |  23 +--
 include/{scsi => uapi/linux}/sg.h |  20 +--
 5 files changed, 20 insertions(+), 335 deletions(-)
 copy include/{scsi => uapi/linux}/scsi_ioctl.h (77%)
 copy include/{scsi => uapi/linux}/sg.h (97%)

diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index 2bd9d67c201a..d54f9db2e079 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -1,60 +1,12 @@
 #ifndef _SCSI_IOCTL_H
 #define _SCSI_IOCTL_H 
 
-#define SCSI_IOCTL_SEND_COMMAND 1
-#define SCSI_IOCTL_TEST_UNIT_READY 2
-#define SCSI_IOCTL_BENCHMARK_COMMAND 3
-#define SCSI_IOCTL_SYNC 4  /* Request synchronous 
parameters */
-#define SCSI_IOCTL_START_UNIT 5
-#define SCSI_IOCTL_STOP_UNIT 6
-/* The door lock/unlock constants are compatible with Sun constants for
-   the cdrom */
-#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
-#define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
-
-/*
- * Here are some obsolete SCSI-specific ioctl commands.
- *
- * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
- */
-
-/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
-#define SCSI_IOCTL_GET_IDLUN   0x5382
-
-/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
-
-/* Used to obtain the host number of a device. */
-#define SCSI_IOCTL_PROBE_HOST  0x5385
-
-/* Used to obtain the bus number for a device */
-#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
-
-/* Used to obtain the PCI location of a device */
-#define SCSI_IOCTL_GET_PCI 0x5387
-
-#ifdef __KERNEL__
+#include 
 
 struct scsi_device;
 
-/*
- * Structures used for scsi_ioctl et al.
- */
-
-typedef struct scsi_ioctl_command {
-   unsigned int inlen;
-   unsigned int outlen;
-   unsigned char data[0];
-} Scsi_Ioctl_Command;
-
-typedef struct scsi_idlun {
-   __u32 dev_id;
-   __u32 host_unique_id;
-} Scsi_Idlun;
-
-
 int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev,
int cmd, bool ndelay);
 extern int scsi_ioctl(struct scsi_device *, int, void __user *);
 
-#endif /* __KERNEL__ */
 #endif /* _SCSI_IOCTL_H */
diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index 370c78c37926..1c80177b9793 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -2,267 +2,11 @@
 #define _SCSI_GENERIC_H
 
 #include 
+#include 
 
-/*
- * History:
- *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
- *  Development Sponsored by Killy Corp. NY NY
- *
- * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
- * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
- *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
- *
- * Documentation
- * =
- * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
- * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
- * Also see: /Documentation/scsi/scsi-generic.txt
- *
- * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
- */
-
-#ifdef __KERNEL__
 extern int sg_big_buff; /* for sysctl */
-#endif
-
-
-typedef struct sg_iovec /* same structure as used by readv() Linux system */
-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
- or scatter gather list */
-unsigned char __user *cmdp; /*

[PATCH v2 2/4] scsi: cleanup scsi/scsi_ioctl.h

2015-09-17 Thread Paolo Bonzini

SCSI_REMOVAL_* goes together with other SCSI command constants in
include/scsi/scsi.h.  It is also used outside the implementation
of the ioctls (and it is not part of the user API).

scsi_fctargaddress/Scsi_FCTargAddress has had no in-tree use since
commit ca61f10ab2b8 ("[SCSI] remove broken driver cpqfc", 2005-10-29).
Remove it, just in time for the the tenth anniversary of its demise.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/scsi.h   | 6 ++
 include/scsi/scsi_ioctl.h | 8 
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index e0a3398b1547..5e2bafdbd96f 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -279,6 +279,12 @@ static inline int scsi_is_wlun(u64 lun)
 #define SCSI_INQ_PQ_NOT_CON 0x01
 #define SCSI_INQ_PQ_NOT_CAP 0x03
 
+/*
+ * PREVENT/ALLOW MEDIUM REMOVAL
+ */
+#defineSCSI_REMOVAL_PREVENT1
+#defineSCSI_REMOVAL_ALLOW  0
+
 
 /*
  * Here are some scsi specific ioctl commands which are sometimes useful.
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index 8d19d1d233c3..c81962bef7a0 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -12,9 +12,6 @@
 #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
 #define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
 
-#defineSCSI_REMOVAL_PREVENT1
-#defineSCSI_REMOVAL_ALLOW  0
-
 #ifdef __KERNEL__
 
 struct scsi_device;
@@ -34,11 +31,6 @@ typedef struct scsi_idlun {
__u32 host_unique_id;
 } Scsi_Idlun;
 
-/* Fibre Channel WWN, port_id struct */
-typedef struct scsi_fctargaddress {
-   __u32 host_port_id;
-   unsigned char host_wwn[8]; // include NULL term.
-} Scsi_FCTargAddress;
 
 int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev,
int cmd, bool ndelay);
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 3/4] scsi: move all obsolete ioctls to scsi/scsi_ioctl.h

2015-09-17 Thread Paolo Bonzini

Some are in scsi/scsi.h.  Keep them together in preparation for exposing
them in UAPI headers.

Suggested-by: Christoph Hellwig <h...@lst.de>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/scsi.h   | 20 
 include/scsi/scsi_ioctl.h | 20 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5e2bafdbd96f..a96df31af89e 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -286,26 +286,6 @@ static inline int scsi_is_wlun(u64 lun)
 #defineSCSI_REMOVAL_ALLOW  0
 
 
-/*
- * Here are some scsi specific ioctl commands which are sometimes useful.
- *
- * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
- */
-
-/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
-#define SCSI_IOCTL_GET_IDLUN   0x5382
-
-/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
-
-/* Used to obtain the host number of a device. */
-#define SCSI_IOCTL_PROBE_HOST  0x5385
-
-/* Used to obtain the bus number for a device */
-#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
-
-/* Used to obtain the PCI location of a device */
-#define SCSI_IOCTL_GET_PCI 0x5387
-
 /* Pull a u32 out of a SCSI message (using BE SCSI conventions) */
 static inline __u32 scsi_to_u32(__u8 *ptr)
 {
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index c81962bef7a0..2bd9d67c201a 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -12,6 +12,26 @@
 #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
 #define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
 
+/*
+ * Here are some obsolete SCSI-specific ioctl commands.
+ *
+ * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395
+ */
+
+/* Used to obtain PUN and LUN info.  Conflicts with CDROMAUDIOBUFSIZ */
+#define SCSI_IOCTL_GET_IDLUN   0x5382
+
+/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */
+
+/* Used to obtain the host number of a device. */
+#define SCSI_IOCTL_PROBE_HOST  0x5385
+
+/* Used to obtain the bus number for a device */
+#define SCSI_IOCTL_GET_BUS_NUMBER  0x5386
+
+/* Used to obtain the PCI location of a device */
+#define SCSI_IOCTL_GET_PCI 0x5387
+
 #ifdef __KERNEL__
 
 struct scsi_device;
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 0/4] scsi: cleanup ioctl headers and provide UAPI versions

2015-09-17 Thread Paolo Bonzini

This is v2 of the series I sent a few days ago.  A bit heavier on
the cleanups:

- export the structs for obsolete SCSI_IOCTL_SEND_COMMAND in
  uapi/linux/scsi_ioctl.h

- reduce the kernel's scsi/scsi_ioctl.h even more

- remove obsolete types from uapi/linux/sg.h

- place all obsolete ioctls in uapi/linux/scsi_ioctl.h, even those
  formerly in scsi/scsi.h [requested by hch]

... and reverting to the flat uapi/linux/ structure instead of
creating uapi/linux/scsi/.

Paolo

Paolo Bonzini (4):
  scsi: remove old-style type names from sg.h
  scsi: cleanup scsi/scsi_ioctl.h
  scsi: move all obsolete ioctls to scsi_ioctl.h
  scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

 include/scsi/scsi.h   |  20 +--
 include/scsi/scsi_ioctl.h |  38 +
 include/scsi/sg.h | 266 +-
 include/uapi/linux/Kbuild |   2 +
 include/{scsi => uapi/linux}/scsi_ioctl.h |  41 +++--
 include/{scsi => uapi/linux}/sg.h |  26 +--
 6 files changed, 38 insertions(+), 355 deletions(-)
 copy include/{scsi => uapi/linux}/scsi_ioctl.h (51%)
 copy include/{scsi => uapi/linux}/sg.h (95%)

-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 1/4] scsi: remove old-style type names from sg.h

2015-09-17 Thread Paolo Bonzini

These will not be exported by the new linux/sg.h header, and scsi/sg.h will
not have any user API after linux/sg.h is created.  Since they have no
user in the kernel, they can be zapped.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/scsi/sg.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index 3afec7032448..370c78c37926 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -207,12 +207,6 @@ typedef struct sg_req_info { /* used by 
SG_GET_REQUEST_TABLE ioctl() */
 
 #define SG_BIG_BUFF SG_DEF_RESERVED_SIZE/* for backward compatibility */
 
-/* Alternate style type names, "..._t" variants preferred */
-typedef struct sg_io_hdr Sg_io_hdr;
-typedef struct sg_io_vec Sg_io_vec;
-typedef struct sg_scsi_id Sg_scsi_id;
-typedef struct sg_req_info Sg_req_info;
-
 
 /* vv */
 /*   The older SG interface based on the 'sg_header' structure follows.   */
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-16 Thread Paolo Bonzini

On 16/09/2015 15:39, Christoph Hellwig wrote:
> Hi Paolo,
> 
> can you just move them to include/linux/ directly?

I'm trying to cater to the objections that were made to Andy's patch.
If you mean the non-UAPI headers, James wanted them to stay in scsi/; if
you mean the UAPI headers, Douglas complained about the flat structure
of include/linux/.

> Also scsi/scsi.h has some additional ioctl defintions that should be
> added to the UAPI scsi_ioctl.h.  Otherwise this looks ok to me.

I can do this, but I think they are obsoleted by SG_GET_SCSI_ID, except
for SCSI_IOCTL_GET_PCI which is horrible anyway. :)  The comment about
conflicts with CDROM ioctls is also interesting.  I can see why one
would want to keep them out of the new canonical place for SCSI ioctls.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-16 Thread Paolo Bonzini



On 16/09/2015 16:23, Christoph Hellwig wrote:
> > I'm trying to cater to the objections that were made to Andy's patch.
> > If you mean the non-UAPI headers, James wanted them to stay in scsi/; if
> > you mean the UAPI headers, Douglas complained about the flat structure
> > of include/linux/.
> 
> Yes, keep the non-UAPI ones as is and move the UAPI ones to linux/ - the flat
> structure is a feature, not a bug.
> 
> > > Also scsi/scsi.h has some additional ioctl defintions that should be
> > > added to the UAPI scsi_ioctl.h.  Otherwise this looks ok to me.
> > 
> > I can do this, but I think they are obsoleted by SG_GET_SCSI_ID, except
> > for SCSI_IOCTL_GET_PCI which is horrible anyway. :)  The comment about
> > conflicts with CDROM ioctls is also interesting.  I can see why one
> > would want to keep them out of the new canonical place for SCSI ioctls.
> 
> No point in hiding them.  We can still officially deprecate them, but I'd
> much rather have them in a single place than hidden away somewhere.

Ok, will do both.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-15 Thread Paolo Bonzini

Provide a UAPI version of the header in the kernel, making it easier
for interested projects to use an up-to-date version of the header.

The new headers are placed under uapi/linux/scsi/ so as not to conflict
with the glibc-provided headers in /usr/include/scsi.

Cc: Andy Grover <agro...@redhat.com>
Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Douglas Gilbert <dgilb...@interlog.com>
Cc: Linux SCSI List <linux-scsi@vger.kernel.org>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
Similar to Andy Grover's patch from last January, though developed
independently.  Based on the remarks to Andy's patch, I'm leaving
scsi.h aside (Christoph), leaving the internal header in scsi/sg.h
(James) and moving the headers to include/linux/scsi/ (Douglas).

 MAINTAINERS|   1 +
 include/scsi/scsi_ioctl.h  |  17 +-
 include/scsi/sg.h  | 267 +
 include/uapi/linux/Kbuild  |   1 +
 include/uapi/linux/scsi/Kbuild |   2 +
 include/{ => uapi/linux}/scsi/scsi_ioctl.h |  36 +---
 include/{ => uapi/linux}/scsi/sg.h |  18 +-
 7 files changed, 17 insertions(+), 325 deletions(-)
 create mode 100644 include/uapi/linux/scsi/Kbuild
 copy include/{ => uapi/linux}/scsi/scsi_ioctl.h (41%)
 copy include/{ => uapi/linux}/scsi/sg.h (97%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 67a4443daed9..b40764d43dde 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9173,6 +9173,7 @@ T:git 
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git
 S: Maintained
 F: drivers/scsi/
 F: include/scsi/
+F: include/uapi/linux/scsi/
 
 SCSI TAPE DRIVER
 M: Kai Mäkisara <kai.makis...@kolumbus.fi>
diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
index 8d19d1d233c3..5b07a0a91675 100644
--- a/include/scsi/scsi_ioctl.h
+++ b/include/scsi/scsi_ioctl.h
@@ -1,21 +1,7 @@
 #ifndef _SCSI_IOCTL_H
 #define _SCSI_IOCTL_H 
 
-#define SCSI_IOCTL_SEND_COMMAND 1
-#define SCSI_IOCTL_TEST_UNIT_READY 2
-#define SCSI_IOCTL_BENCHMARK_COMMAND 3
-#define SCSI_IOCTL_SYNC 4  /* Request synchronous 
parameters */
-#define SCSI_IOCTL_START_UNIT 5
-#define SCSI_IOCTL_STOP_UNIT 6
-/* The door lock/unlock constants are compatible with Sun constants for
-   the cdrom */
-#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */
-#define SCSI_IOCTL_DOORUNLOCK 0x5381   /* unlock the mechanism   */
-
-#defineSCSI_REMOVAL_PREVENT1
-#defineSCSI_REMOVAL_ALLOW  0
-
-#ifdef __KERNEL__
+#include 
 
 struct scsi_device;
 
@@ -44,5 +30,4 @@ int scsi_ioctl_block_when_processing_errors(struct 
scsi_device *sdev,
int cmd, bool ndelay);
 extern int scsi_ioctl(struct scsi_device *, int, void __user *);
 
-#endif /* __KERNEL__ */
 #endif /* _SCSI_IOCTL_H */
diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index 3afec7032448..87b35e7c4c31 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -1,274 +1,11 @@
 #ifndef _SCSI_GENERIC_H
 #define _SCSI_GENERIC_H
 
-#include 
+#include 
 
-/*
- * History:
- *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
- *  Development Sponsored by Killy Corp. NY NY
- *
- * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
- * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
- *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
- *
- * Documentation
- * =
- * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
- * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
- * Also see: /Documentation/scsi/scsi-generic.txt
- *
- * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
- */
-
-#ifdef __KERNEL__
 extern int sg_big_buff; /* for sysctl */
-#endif
-
-
-typedef struct sg_iovec /* same structure as used by readv() Linux system */
-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
-

Re: [PATCH v2] virtio_scsi: don't select CONFIG_BLK_DEV_INTEGRITY

2015-04-27 Thread Paolo Bonzini



On 27/04/2015 14:56, Christoph Hellwig wrote:
 T10 PI is just another optional feature, LLDDs should work without
 the infrastructure.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/Kconfig   |  1 -
  drivers/scsi/virtio_scsi.c | 11 ++-
  2 files changed, 10 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
 index b021bcb..896bea6 100644
 --- a/drivers/scsi/Kconfig
 +++ b/drivers/scsi/Kconfig
 @@ -1743,7 +1743,6 @@ config SCSI_BFA_FC
  config SCSI_VIRTIO
   tristate virtio-scsi support
   depends on VIRTIO
 - select BLK_DEV_INTEGRITY
   help
This is the virtual HBA driver for virtio.  If the kernel will
be used in a virtual machine, say Y or M.
 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index f164f24..285f775 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -501,6 +501,7 @@ static void virtio_scsi_init_hdr(struct virtio_device 
 *vdev,
   cmd-crn = 0;
  }
  
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
  static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev,
   struct virtio_scsi_cmd_req_pi *cmd_pi,
   struct scsi_cmnd *sc)
 @@ -524,6 +525,7 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device 
 *vdev,
  blk_rq_sectors(rq) *
  bi-tuple_size);
  }
 +#endif
  
  static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
struct virtio_scsi_vq *req_vq,
 @@ -546,11 +548,14 @@ static int virtscsi_queuecommand(struct virtio_scsi 
 *vscsi,
  
   BUG_ON(sc-cmd_len  VIRTIO_SCSI_CDB_SIZE);
  
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
   if (virtio_has_feature(vscsi-vdev, VIRTIO_SCSI_F_T10_PI)) {
   virtio_scsi_init_hdr_pi(vscsi-vdev, cmd-req.cmd_pi, sc);
   memcpy(cmd-req.cmd_pi.cdb, sc-cmnd, sc-cmd_len);
   req_size = sizeof(cmd-req.cmd_pi);
 - } else {
 + } else
 +#endif
 + {
   virtio_scsi_init_hdr(vscsi-vdev, cmd-req.cmd, sc);
   memcpy(cmd-req.cmd.cdb, sc-cmnd, sc-cmd_len);
   req_size = sizeof(cmd-req.cmd);
 @@ -1002,6 +1007,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
   shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
   shost-nr_hw_queues = num_queues;
  
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
   if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {
   host_prot = SHOST_DIF_TYPE1_PROTECTION | 
 SHOST_DIF_TYPE2_PROTECTION |
   SHOST_DIF_TYPE3_PROTECTION | 
 SHOST_DIX_TYPE1_PROTECTION |
 @@ -1010,6 +1016,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
   scsi_host_set_prot(shost, host_prot);
   scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
   }
 +#endif
  
   err = scsi_add_host(shost, vdev-dev);
   if (err)
 @@ -1090,7 +1097,9 @@ static struct virtio_device_id id_table[] = {
  static unsigned int features[] = {
   VIRTIO_SCSI_F_HOTPLUG,
   VIRTIO_SCSI_F_CHANGE,
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
   VIRTIO_SCSI_F_T10_PI,
 +#endif
  };
  
  static struct virtio_driver virtio_scsi_driver = {
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] virtio_scsi: don't select CONFIG_BLK_DEV_INTEGRITY

2015-04-24 Thread Paolo Bonzini



On 23/04/2015 20:10, Christoph Hellwig wrote:
 T10 PI is just another optional feature, LLDDs should work without
 the infrastructure.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/Kconfig   |  1 -
  drivers/scsi/virtio_scsi.c | 11 ++-
  2 files changed, 10 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
 index b021bcb..896bea6 100644
 --- a/drivers/scsi/Kconfig
 +++ b/drivers/scsi/Kconfig
 @@ -1743,7 +1743,6 @@ config SCSI_BFA_FC
  config SCSI_VIRTIO
   tristate virtio-scsi support
   depends on VIRTIO
 - select BLK_DEV_INTEGRITY
   help
This is the virtual HBA driver for virtio.  If the kernel will
be used in a virtual machine, say Y or M.
 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index f164f24..0f2bb5b 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -501,6 +501,7 @@ static void virtio_scsi_init_hdr(struct virtio_device 
 *vdev,
   cmd-crn = 0;
  }
  
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
  static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev,
   struct virtio_scsi_cmd_req_pi *cmd_pi,
   struct scsi_cmnd *sc)
 @@ -524,6 +525,7 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device 
 *vdev,
  blk_rq_sectors(rq) *
  bi-tuple_size);
  }
 +#endif
  
  static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
struct virtio_scsi_vq *req_vq,
 @@ -546,11 +548,14 @@ static int virtscsi_queuecommand(struct virtio_scsi 
 *vscsi,
  
   BUG_ON(sc-cmd_len  VIRTIO_SCSI_CDB_SIZE);
  
 +#ifdef CONFIG_BLK_DEV_INTEGRITY
   if (virtio_has_feature(vscsi-vdev, VIRTIO_SCSI_F_T10_PI)) {
   virtio_scsi_init_hdr_pi(vscsi-vdev, cmd-req.cmd_pi, sc);
   memcpy(cmd-req.cmd_pi.cdb, sc-cmnd, sc-cmd_len);
   req_size = sizeof(cmd-req.cmd_pi);
 - } else {
 + } else
 +#endif
 + {

All good so far.

   virtio_scsi_init_hdr(vscsi-vdev, cmd-req.cmd, sc);
   memcpy(cmd-req.cmd.cdb, sc-cmnd, sc-cmd_len);
   req_size = sizeof(cmd-req.cmd);
 @@ -1002,6 +1007,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
   shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
   shost-nr_hw_queues = num_queues;
  
 +#ifdef VIRTIO_SCSI_F_T10_PI

This symbol is always defined; it is part of the uapi/ header.

I think you wanted #ifdef CONFIG_BLK_DEV_INTEGRITY here as well.
However, you can remove this #ifdef completely.  This is because...

   if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {

... this condition should always be false unless the feature is
negotiated, and the feature should not be negotiated when
CONFIG_BLK_DEV_INTEGRITY is disabled.

Maybe you can add a WARN_ON(!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY))
depending on your taste.

   host_prot = SHOST_DIF_TYPE1_PROTECTION | 
 SHOST_DIF_TYPE2_PROTECTION |
   SHOST_DIF_TYPE3_PROTECTION | 
 SHOST_DIX_TYPE1_PROTECTION |

Also, if you leave the #ifdef, host_prot is now unused if
CONFIG_BLK_DEV_INTEGRITY is disabled.  If you prefer to keep the #ifdef
you should also change host_prot to a macro (e.g. VIRTIO_SCSI_HOST_PROT)
to silence the compiler warning.

 @@ -1010,6 +1016,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
   scsi_host_set_prot(shost, host_prot);
   scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
   }
 +#endif
  
   err = scsi_add_host(shost, vdev-dev);
   if (err)
 @@ -1090,7 +1097,9 @@ static struct virtio_device_id id_table[] = {
  static unsigned int features[] = {
   VIRTIO_SCSI_F_HOTPLUG,
   VIRTIO_SCSI_F_CHANGE,
 +#ifdef VIRTIO_SCSI_F_T10_PI

This one definitely should be #ifdef CONFIG_BLK_DEV_INTEGRITY.

Thanks,

Paolo

   VIRTIO_SCSI_F_T10_PI,
 +#endif
  };
  
  static struct virtio_driver virtio_scsi_driver = {
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/3] Fix return code for ioctl( BLKRRPART ) if device is down

2015-03-24 Thread Paolo Bonzini



On 24/03/2015 11:16, Fam Zheng wrote:
 If issued right after link down, blockdev --rereadpt will be stuck for a
 while and then return normally. Although the underlying capacity and partition
 table are not correctly updated. And it means that userspace can't detect the
 error at all.
 
 Fix this by propargating the error of read capacity command through the
 stack, so that the ioctl could fail with -EIO.
 
 Fam Zheng (3):
   block: Return error in rescan_partitions if revalidating disk failed
   sd: Return error in sd_revalidate_disk if read capacity failed
   sd: Return -EIO if read capacity failed
 
  block/partition-generic.c |  6 +++---
  drivers/scsi/sd.c | 22 +-
  2 files changed, 16 insertions(+), 12 deletions(-)
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Though patch 3 could be seen as a change in userspace ABI, so I'm less
sure about it.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-03-16 Thread Paolo Bonzini



On 05/03/2015 14:37, Paolo Bonzini wrote:
 
 
 On 05/03/2015 14:33, Christoph Hellwig wrote:
 Any chance to get reviews for this series?  Also we should at least
 expedite this first patch into 4.0-rc as it fixes scanning races
 in virtio_scsi.
 
 I reviewed 1 and 3, but I'm not really qualified for patch 2.

Christoph,

any news about these patches?

Thanks,

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get

2015-03-05 Thread Paolo Bonzini



On 02/02/2015 14:01, Christoph Hellwig wrote:
 This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on
 module removal (and individual device removal) and dc4515ea (scsi: always
 increment reference count).
 
 We now never call scsi_device_get from the shutdown path, and the fact
 that we started grabbing reference there in commit 85b6c7 turned out
 turned out to create more problems than it solves, and required
 workarounds for workarounds for workarounds. Move back to properly checking
 the device state and carefully handle module refcounting.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 ---
  drivers/scsi/scsi.c | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
 index 9b38299..9b7fd0b 100644
 --- a/drivers/scsi/scsi.c
 +++ b/drivers/scsi/scsi.c
 @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode);
   * Description: Gets a reference to the scsi_device and increments the use 
 count
   * of the underlying LLDD module.  You must hold host_lock of the
   * parent Scsi_Host or already have a reference when calling this.
 + *
 + * This will fail if a device is deleted or cancelled, or when the LLD module
 + * is in the process of being unloaded.
   */
  int scsi_device_get(struct scsi_device *sdev)
  {
 - if (sdev-sdev_state == SDEV_DEL)
 - return -ENXIO;
 + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL)
 + goto fail;
   if (!get_device(sdev-sdev_gendev))
 - return -ENXIO;
 - /* We can fail try_module_get if we're doing SCSI operations
 -  * from module exit (like cache flush) */
 - __module_get(sdev-host-hostt-module);
 -
 + goto fail;
 + if (!try_module_get(sdev-host-hostt-module))
 + goto fail_put_device;
   return 0;
 +
 +fail_put_device:
 + put_device(sdev-sdev_gendev);
 +fail:
 + return -ENXIO;
  }
  EXPORT_SYMBOL(scsi_device_get);
  
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-03-05 Thread Paolo Bonzini



On 02/02/2015 14:01, Christoph Hellwig wrote:
 Lock the device embedded in the scsi_device to protect against
 concurrent calls to -remove.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 Acked-by: Alan Stern st...@rowland.harvard.edu
 ---
  drivers/scsi/scsi_scan.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
 index 983aed1..523faee 100644
 --- a/drivers/scsi/scsi_scan.c
 +++ b/drivers/scsi/scsi_scan.c
 @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device);
  
  void scsi_rescan_device(struct device *dev)
  {
 - if (!dev-driver)
 - return;
 -
 - if (try_module_get(dev-driver-owner)) {
 + device_lock(dev);
 + if (dev-driver  try_module_get(dev-driver-owner)) {
   struct scsi_driver *drv = to_scsi_driver(dev-driver);
  
   if (drv-rescan)
   drv-rescan(dev);
   module_put(dev-driver-owner);
   }
 + device_unlock(dev);
  }
  EXPORT_SYMBOL(scsi_rescan_device);
  
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-03-05 Thread Paolo Bonzini



On 05/03/2015 14:33, Christoph Hellwig wrote:
 Any chance to get reviews for this series?  Also we should at least
 expedite this first patch into 4.0-rc as it fixes scanning races
 in virtio_scsi.

I reviewed 1 and 3, but I'm not really qualified for patch 2.

Paolo

 On Mon, Feb 02, 2015 at 02:01:24PM +0100, Christoph Hellwig wrote:
 Lock the device embedded in the scsi_device to protect against
 concurrent calls to -remove.

 Signed-off-by: Christoph Hellwig h...@lst.de
 Acked-by: Alan Stern st...@rowland.harvard.edu
 ---
  drivers/scsi/scsi_scan.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

 diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
 index 983aed1..523faee 100644
 --- a/drivers/scsi/scsi_scan.c
 +++ b/drivers/scsi/scsi_scan.c
 @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device);
  
  void scsi_rescan_device(struct device *dev)
  {
 -if (!dev-driver)
 -return;
 -
 -if (try_module_get(dev-driver-owner)) {
 +device_lock(dev);
 +if (dev-driver  try_module_get(dev-driver-owner)) {
  struct scsi_driver *drv = to_scsi_driver(dev-driver);
  
  if (drv-rescan)
  drv-rescan(dev);
  module_put(dev-driver-owner);
  }
 +device_unlock(dev);
  }
  EXPORT_SYMBOL(scsi_rescan_device);
  
 -- 
 1.9.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 ---end quoted text---
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-02-02 Thread Paolo Bonzini



On 02/02/2015 13:59, Christoph Hellwig wrote:
 On Fri, Jan 30, 2015 at 10:46:17AM +0100, Paolo Bonzini wrote:
  Great, we might want to revert that patch in 3.21.
 Is that fix in any tree yet?  Seems like I missed it for the scsi
 tree at least.  So unless you want it for 3.19/stable we might as well
 ust skip that patch.

Yes, I agree.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/8] lib/iovec: Add memcpy_fromiovec_out library function

2015-01-30 Thread Paolo Bonzini



On 30/01/2015 09:12, Nicholas A. Bellinger wrote:
 From: Nicholas Bellinger n...@linux-iscsi.org
 
 This patch adds a new memcpy_fromiovec_out() library function which modifies
 the passed *iov following memcpy_fromiovec(), but also returns the next 
 current
 iovec pointer via **iov_out.
 
 This is useful for vhost ANY_LAYOUT support when guests are allowed to 
 generate
 incoming virtio request headers combined with subsequent SGL payloads into a
 single iovec.
 
 Cc: Michael S. Tsirkin m...@redhat.com
 Cc: Paolo Bonzini pbonz...@redhat.com
 Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org
 ---
  include/linux/uio.h |  2 ++
  lib/iovec.c | 27 +++
  2 files changed, 29 insertions(+)
 
 diff --git a/include/linux/uio.h b/include/linux/uio.h
 index 1c5e453..3e4473d 100644
 --- a/include/linux/uio.h
 +++ b/include/linux/uio.h
 @@ -136,6 +136,8 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, 
 __wsum *csum, struct iov_
  size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, 
 struct iov_iter *i);
  
  int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len);
 +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov,
 +  struct iovec **iov_out, int len);
  int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
   int offset, int len);
  int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata,
 diff --git a/lib/iovec.c b/lib/iovec.c
 index 2d99cb4..6a813dd 100644
 --- a/lib/iovec.c
 +++ b/lib/iovec.c
 @@ -28,6 +28,33 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec 
 *iov, int len)
  EXPORT_SYMBOL(memcpy_fromiovec);
  
  /*
 + *   Copy iovec to kernel, saving the current iov to *iov_out.
 + *   Returns -EFAULT on error.

Perhaps document that iov is modified, zeroing everything in [iov,
*iov_out) and possibly removing the front of *iov_out?

Paolo

 + */
 +
 +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov,
 +  struct iovec **iov_out, int len)
 +{
 + while (len  0) {
 + if (iov-iov_len) {
 + int copy = min_t(unsigned int, len, iov-iov_len);
 + if (copy_from_user(kdata, iov-iov_base, copy))
 + return -EFAULT;
 + len -= copy;
 + kdata += copy;
 + iov-iov_base += copy;
 + iov-iov_len -= copy;
 + }
 + if (!iov-iov_len)
 + iov++;
 + }
 + *iov_out = iov;
 +
 + return 0;
 +}
 +EXPORT_SYMBOL(memcpy_fromiovec_out);
 +
 +/*
   *   Copy kernel to iovec. Returns -EFAULT on error.
   */
  
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-01-30 Thread Paolo Bonzini



On 30/01/2015 02:08, Fam Zheng wrote:
 On Fri, 01/30 00:11, Paolo Bonzini wrote:


 On 29/01/2015 00:00, Christoph Hellwig wrote:
 Lock the device embedded in the scsi_device to protect against
 concurrent calls to -remove.

 Signed-off-by: Christoph Hellwig h...@lst.de

 I wonder if this makes this problem: https://lkml.org/lkml/2015/1/5/9 go
 away.
 
 A quick test says yes.

Great, we might want to revert that patch in 3.21.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] scsi: serialize -rescan against -remove

2015-01-29 Thread Paolo Bonzini



On 29/01/2015 00:00, Christoph Hellwig wrote:
 Lock the device embedded in the scsi_device to protect against
 concurrent calls to -remove.
 
 Signed-off-by: Christoph Hellwig h...@lst.de

I wonder if this makes this problem: https://lkml.org/lkml/2015/1/5/9 go
away.

Paolo

 ---
  drivers/scsi/scsi_scan.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
 index 983aed1..523faee 100644
 --- a/drivers/scsi/scsi_scan.c
 +++ b/drivers/scsi/scsi_scan.c
 @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device);
  
  void scsi_rescan_device(struct device *dev)
  {
 - if (!dev-driver)
 - return;
 -
 - if (try_module_get(dev-driver-owner)) {
 + device_lock(dev);
 + if (dev-driver  try_module_get(dev-driver-owner)) {
   struct scsi_driver *drv = to_scsi_driver(dev-driver);
  
   if (drv-rescan)
   drv-rescan(dev);
   module_put(dev-driver-owner);
   }
 + device_unlock(dev);
  }
  EXPORT_SYMBOL(scsi_rescan_device);
  
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] scsi: Fix max transfer length for 4k disks

2015-01-29 Thread Paolo Bonzini



On 29/01/2015 22:54, Brian King wrote:
 The following patch fixes an issue observed with 4k sector disks
 where the max_hw_sectors attribute was getting set too large in
 sd_revalidate_disk. Since sdkp-max_xfer_blocks is in units
 of SCSI logical blocks and queue_max_hw_sectors is in units of
 512 byte blocks, on a 4k sector disk, every time we went through
 sd_revalidate_disk, we were taking the current value of
 queue_max_hw_sectors and increasing it by a factor of 8. Fix
 this by only shifting sdkp-max_xfer_blocks.
 
 Cc: stablesta...@vger.kernel.org
 Signed-off-by: Brian King brk...@linux.vnet.ibm.com
 ---
 
  drivers/scsi/sd.c |6 --
  1 file changed, 4 insertions(+), 2 deletions(-)
 
 diff -puN drivers/scsi/sd.c~sd_revalidate_4k drivers/scsi/sd.c
 --- linux/drivers/scsi/sd.c~sd_revalidate_4k  2015-01-29 14:44:23.316171187 
 -0600
 +++ linux-bjking1/drivers/scsi/sd.c   2015-01-29 14:51:05.846126392 -0600
 @@ -2800,9 +2800,11 @@ static int sd_revalidate_disk(struct gen
*/
   sd_set_flush_flag(sdkp);
  
 - max_xfer = min_not_zero(queue_max_hw_sectors(sdkp-disk-queue),
 - sdkp-max_xfer_blocks);
 + max_xfer = sdkp-max_xfer_blocks;
   max_xfer = ilog2(sdp-sector_size) - 9;
 +
 + max_xfer = min_not_zero(queue_max_hw_sectors(sdkp-disk-queue),
 + max_xfer);
   blk_queue_max_hw_sectors(sdkp-disk-queue, max_xfer);
   set_capacity(disk, sdkp-capacity);
   sd_config_write_same(sdkp);
 _
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] SCSI: SD: set max_ws_blocks as max_unmap_blocks if it isn't provided

2014-12-05 Thread Paolo Bonzini



On 05/12/2014 14:58, Martin K. Petersen wrote:
 Ming == Ming Lei ming@canonical.com writes:
 
 What about in READ CAPACITY(16)?
 
 Ming It isn't set too.
 
 Please try the following patch:
 
 
 [SCSI] sd: Tweak discard heuristics to work around QEMU SCSI issue
 
 7985090aa020 changed the discard heuristics to give preference to the
 WRITE SAME commands that (unlike UNMAP) guarantee deterministic results.
 
 Ming Lei discovered that QEMU SCSI's WRITE SAME implementation
 internally relied on limits that were only communicated for the UNMAP
 case. And therefore discard commands backed by WRITE SAME would fail.
 
 Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only
 prefer the WRITE SAME variants if the device has the LBPRZ flag set.
 
 Reported-by: Ming Lei ming@canonical.com
 Signed-off-by: Martin K. Petersen martin.peter...@oracle.com
 
 diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
 index 95bfb7bfbb9d..76c5d1388417 100644
 --- a/drivers/scsi/sd.c
 +++ b/drivers/scsi/sd.c
 @@ -2623,8 +2623,9 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
   sd_config_discard(sdkp, SD_LBP_WS16);
  
   } else {/* LBP VPD page tells us what to use */
 -
 - if (sdkp-lbpws)
 + if (sdkp-lbpu  sdkp-max_unmap_blocks  
 !sdkp-lbprz)
 + sd_config_discard(sdkp, SD_LBP_UNMAP);
 + else if (sdkp-lbpws)
   sd_config_discard(sdkp, SD_LBP_WS16);
   else if (sdkp-lbpws10)
   sd_config_discard(sdkp, SD_LBP_WS10);
 

This is the right fix.  Ming, how do you reproduce the QEMU bug?

Acked-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] SCSI: SD: set max_ws_blocks as max_unmap_blocks if it isn't provided

2014-12-05 Thread Paolo Bonzini



On 05/12/2014 14:05, Ming Lei wrote:
 
 [   50.112885] sd 0:0:1:0: [sda] FAILED Result: hostbyte=DID_OK
 driverbyte=DRIVER_SENSE
 [   50.113859] sd 0:0:1:0: [sda] Sense Key : Illegal Request [current]
 [   50.113859] sd 0:0:1:0: [sda] Add. Sense: Invalid field in cdb
 [   50.113859] sd 0:0:1:0: [sda] CDB:
 [   50.113859] Write same(16): 93 08 00 00 00 00 00 00 80 00 00 40 00 00 00 00
 [   50.113859] blk_update_request: critical target error, dev sda, sector
 32768

So this command is zeroing 2^22 sectors (2GB) starting from sector 128.
 How did you run QEMU and what command produced this request?

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] libata: Whitelist SSDs that are known to properly return zeroes after TRIM

2014-12-05 Thread Paolo Bonzini



On 07/11/2014 06:08, Martin K. Petersen wrote:
 The whitelist is only meant as a starting point and is by no means
 comprehensive:
 
- All intel SSD models except for 510
- Micron M5*
- Samsung SSDs
- Seagate SSDs
 
 Signed-off-by: Martin K. Petersen martin.peter...@oracle.com
 ---
  drivers/ata/libata-core.c | 18 ++
  drivers/ata/libata-scsi.c | 10 ++
  include/linux/libata.h|  1 +
  3 files changed, 21 insertions(+), 8 deletions(-)
 
 diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
 index c5ba15af87d3..f41f24a8bc21 100644
 --- a/drivers/ata/libata-core.c
 +++ b/drivers/ata/libata-core.c
 @@ -4225,10 +4225,20 @@ static const struct ata_blacklist_entry 
 ata_device_blacklist [] = {
   { PIONEER DVD-RW  DVR-216D,   NULL,   ATA_HORKAGE_NOSETXFER },
  
   /* devices that don't properly handle queued TRIM commands */
 - { Micron_M500*,   NULL,   ATA_HORKAGE_NO_NCQ_TRIM, },
 - { Crucial_CT???M500SSD*,  NULL,   ATA_HORKAGE_NO_NCQ_TRIM, },
 - { Micron_M550*,   NULL,   ATA_HORKAGE_NO_NCQ_TRIM, },
 - { Crucial_CT*M550SSD*,NULL,   ATA_HORKAGE_NO_NCQ_TRIM, },
 + { Micron_M5?0*,   NULL,   ATA_HORKAGE_NO_NCQ_TRIM |
 + ATA_HORKAGE_ZERO_AFTER_TRIM, },
 + { Crucial_CT???M5?0SSD*,  NULL,   ATA_HORKAGE_NO_NCQ_TRIM, },

I have a Crucial_CT256MX1 (i.e. MX100) and it does reliably zero.

BTW. it's the same hardware as the M550, so probably the same set of
quirks should apply to both.

Paolo

 +
 + /*
 +  * DRAT/RZAT are weak guarantees. Explicitly black/whitelist
 +  * SSDs that provide reliable zero after TRIM.
 +  */
 + { INTEL*SSDSC2MH*,NULL,   0, }, /* Blacklist intel 510 */
 + { INTEL*SSD*, NULL,   ATA_HORKAGE_ZERO_AFTER_TRIM, },
 + { SSD*INTEL*, NULL,   ATA_HORKAGE_ZERO_AFTER_TRIM, },
 + { Samsung*SSD*,   NULL,   ATA_HORKAGE_ZERO_AFTER_TRIM, },
 + { SAMSUNG*SSD*,   NULL,   ATA_HORKAGE_ZERO_AFTER_TRIM, },
 + { ST[1248][0248]0[FH]*,   NULL,   ATA_HORKAGE_ZERO_AFTER_TRIM, },
  
   /*
* Some WD SATA-I drives spin up and down erratically when the link
 diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
 index 0586f66d70fa..deaa6e34ed4d 100644
 --- a/drivers/ata/libata-scsi.c
 +++ b/drivers/ata/libata-scsi.c
 @@ -2515,13 +2515,15 @@ static unsigned int ata_scsiop_read_cap(struct 
 ata_scsi_args *args, u8 *rbuf)
   rbuf[15] = lowest_aligned;
  
   if (ata_id_has_trim(args-id)) {
 - rbuf[14] |= 0x80; /* TPE */
 + rbuf[14] |= 0x80; /* LBPME */
  
 - if (ata_id_has_zero_after_trim(args-id))
 - rbuf[14] |= 0x40; /* TPRZ */
 + if (ata_id_has_zero_after_trim(args-id) 
 + dev-horkage  ATA_HORKAGE_ZERO_AFTER_TRIM) {
 + ata_dev_warn(dev, Enabling 
 discard_zeroes_data\n);
 + rbuf[14] |= 0x40; /* LBPRZ */
 + }
   }
   }
 -
   return 0;
  }
  
 diff --git a/include/linux/libata.h b/include/linux/libata.h
 index bd5fefeaf548..45ac825b8366 100644
 --- a/include/linux/libata.h
 +++ b/include/linux/libata.h
 @@ -421,6 +421,7 @@ enum {
   ATA_HORKAGE_NO_NCQ_TRIM = (1  19),/* don't use queued TRIM */
   ATA_HORKAGE_NOLPM   = (1  20),/* don't use LPM */
   ATA_HORKAGE_WD_BROKEN_LPM = (1  21),  /* some WDs have broken LPM */
 + ATA_HORKAGE_ZERO_AFTER_TRIM = (1  22),/* guarantees zero after trim */
  
/* DMA mask for user DMA control: User visible values; DO NOT
   renumber */
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/10] esp_scsi: spellcheck 'driver'

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/esp_scsi.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h
 index cd68805..b5862e4 100644
 --- a/drivers/scsi/esp_scsi.h
 +++ b/drivers/scsi/esp_scsi.h
 @@ -1,4 +1,4 @@
 -/* esp_scsi.h: Defines and structures for the ESP drier.
 +/* esp_scsi.h: Defines and structures for the ESP driver.
   *
   * Copyright (C) 2007 David S. Miller (da...@davemloft.net)
   */
 

A good start. :)

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/10] esp_scsi: make number of tags configurable

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 Add a field 'num_tags' to the esp structure to allow drivers
 to overwrite the number of avialable tags if required.
 Default is ESP_DEFAULT_TAGS.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/esp_scsi.c | 10 --
  drivers/scsi/esp_scsi.h |  3 +--
  2 files changed, 5 insertions(+), 8 deletions(-)
 
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index 38c23e0..99cd42f 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -2317,6 +2317,8 @@ int scsi_esp_register(struct esp *esp, struct device 
 *dev)
   static int instance;
   int err;
  
 + if (!esp-num_tags)
 + esp-num_tags = ESP_DEFAULT_TAGS;
   esp-host-transportt = esp_transport_template;
   esp-host-max_lun = ESP_MAX_LUN;
   esp-host-cmd_per_lun = 2;
 @@ -2403,12 +2405,8 @@ static int esp_slave_configure(struct scsi_device *dev)
   struct esp *esp = shost_priv(dev-host);
   struct esp_target_data *tp = esp-target[dev-id];
  
 - if (dev-tagged_supported) {
 - /* XXX make this configurable somehow XXX */
 - int goal_tags = min(ESP_DEFAULT_TAGS, ESP_MAX_TAG);

min(16, 256) = 16

Might add a WARN_ON(esp-num_tags  ESP_MAX_TAG) after you assign
esp-num_tags.


 - scsi_adjust_queue_depth(dev, goal_tags);
 - }
 + if (dev-tagged_supported)
 + scsi_adjust_queue_depth(dev, esp-num_tags);
  
   tp-flags |= ESP_TGT_DISCONNECT;
  
 diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h
 index b5862e4..975d293 100644
 --- a/drivers/scsi/esp_scsi.h
 +++ b/drivers/scsi/esp_scsi.h
 @@ -283,7 +283,6 @@ struct esp_cmd_entry {
   struct completion   *eh_done;
  };
  
 -/* XXX make this configurable somehow XXX */
  #define ESP_DEFAULT_TAGS 16
  
  #define ESP_MAX_TARGET   16
 @@ -445,7 +444,7 @@ struct esp {
   u8  prev_soff;
   u8  prev_stp;
   u8  prev_cfg3;
 - u8  __pad;
 + u8  num_tags;
  
   struct list_headesp_cmd_pool;
  
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 04/10] esp_scsi: debug event and command

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 Add new debug definitions for event and command logging.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/esp_scsi.c | 15 +++
  1 file changed, 15 insertions(+)
 
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index 492c51b..fe3278e 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -49,6 +49,8 @@ static u32 esp_debug;
  #define ESP_DEBUG_DATADONE   0x0100
  #define ESP_DEBUG_RECONNECT  0x0200
  #define ESP_DEBUG_AUTOSENSE  0x0400
 +#define ESP_DEBUG_EVENT  0x0800
 +#define ESP_DEBUG_COMMAND0x1000
  
  #define esp_log_intr(f, a...) \
  do { if (esp_debug  ESP_DEBUG_INTR) \
 @@ -100,6 +102,16 @@ do { if (esp_debug  ESP_DEBUG_AUTOSENSE) \
   shost_printk(KERN_DEBUG, esp-host, f, ## a);   \
  } while (0)
  
 +#define esp_log_event(f, a...) \
 +do {   if (esp_debug  ESP_DEBUG_EVENT)  \
 + shost_printk(KERN_DEBUG, esp-host, f, ## a);   \
 +} while (0)
 +
 +#define esp_log_command(f, a...) \
 +do {   if (esp_debug  ESP_DEBUG_COMMAND)\
 + shost_printk(KERN_DEBUG, esp-host, f, ## a);   \
 +} while (0)
 +
  #define esp_read8(REG)   esp-ops-esp_read8(esp, REG)
  #define esp_write8(VAL,REG)  esp-ops-esp_write8(esp, VAL, REG)
  
 @@ -126,6 +138,7 @@ void scsi_esp_cmd(struct esp *esp, u8 val)
  
   esp-esp_event_cur = (idx + 1)  (ESP_EVENT_LOG_SZ - 1);
  
 + esp_log_command(cmd[%02x]\n, val);
   esp_write8(val, ESP_CMD);
  }
  EXPORT_SYMBOL(scsi_esp_cmd);
 @@ -1638,6 +1651,8 @@ static int esp_process_event(struct esp *esp)
  
  again:
   write = 0;
 + esp_log_event(process event %d phase %x\n,
 +   esp-event, esp-sreg  ESP_STAT_PMASK);
   switch (esp-event) {
   case ESP_EVENT_CHECK_PHASE:
   switch (esp-sreg  ESP_STAT_PMASK) {
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/10] esp_scsi: convert to dev_printk

2014-11-21 Thread Paolo Bonzini

, esp-dma_regs,
 +esp-host-irq);
 + dev_printk(KERN_INFO, dev,
 +esp%u: is a %s, %u MHz (ccf=%u), SCSI ID %u\n,
 +esp-host-unique_id, esp_chip_names[esp-rev],
 +esp-cfreq / 100, esp-cfact, esp-scsi_id);
  
   /* Let the SCSI bus reset settle. */
   ssleep(esp_bus_reset_settle);
 @@ -2435,19 +2434,20 @@ static int esp_eh_abort_handler(struct scsi_cmnd *cmd)
* XXX much for the final driver.
*/
   spin_lock_irqsave(esp-host-host_lock, flags);
 - printk(KERN_ERR PFX esp%d: Aborting command [%p:%02x]\n,
 -esp-host-unique_id, cmd, cmd-cmnd[0]);
 + shost_printk(KERN_ERR, esp-host, Aborting command [%p:%02x]\n,
 +  cmd, cmd-cmnd[0]);
   ent = esp-active_cmd;
   if (ent)
 - printk(KERN_ERR PFX esp%d: Current command [%p:%02x]\n,
 -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]);
 + shost_printk(KERN_ERR, esp-host,
 +  Current command [%p:%02x]\n,
 +  ent-cmd, ent-cmd-cmnd[0]);
   list_for_each_entry(ent, esp-queued_cmds, list) {
 - printk(KERN_ERR PFX esp%d: Queued command [%p:%02x]\n,
 -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]);
 + shost_printk(KERN_ERR, esp-host, Queued command [%p:%02x]\n,
 +  ent-cmd, ent-cmd-cmnd[0]);
   }
   list_for_each_entry(ent, esp-active_cmds, list) {
 - printk(KERN_ERR PFX esp%d: Active command [%p:%02x]\n,
 -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]);
 + shost_printk(KERN_ERR, esp-host,  Active command [%p:%02x]\n,
 +  ent-cmd, ent-cmd-cmnd[0]);
   }
   esp_dump_cmd_log(esp);
   spin_unlock_irqrestore(esp-host-host_lock, flags);
 

Looks good.

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/10] esp_scsi: read status registers

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 A read to ESP_INTRPT will clear ESP_STATUS and ESP_SSTEP. So read
 all status registers in one go to avoid losing information.

(ESP_STAT_TCNT is actually kept in the status register, it is cleared by
writing TCLO/MID/HI).

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/esp_scsi.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index fe3278e..92ab921 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -982,7 +982,6 @@ static int esp_check_spur_intr(struct esp *esp)
  
   default:
   if (!(esp-sreg  ESP_STAT_INTR)) {
 - esp-ireg = esp_read8(ESP_INTRPT);
   if (esp-ireg  ESP_INTR_SR)
   return 1;
  
 @@ -2056,7 +2055,12 @@ static void __esp_interrupt(struct esp *esp)
   int finish_reset, intr_done;
   u8 phase;
  
 +   /*
 + * Once INTRPT is read STATUS and SSTEP are cleared.
 + */
   esp-sreg = esp_read8(ESP_STATUS);
 + esp-seqreg = esp_read8(ESP_SSTEP);
 + esp-ireg = esp_read8(ESP_INTRPT);
  
   if (esp-flags  ESP_FLAG_RESETTING) {
   finish_reset = 1;
 @@ -2069,8 +2073,6 @@ static void __esp_interrupt(struct esp *esp)
   return;
   }
  
 - esp-ireg = esp_read8(ESP_INTRPT);
 -
   if (esp-ireg  ESP_INTR_SR)
   finish_reset = 1;
  
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/10] esp: Use FIFO for command submission

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 The am53c974 has a design flaw causing it to throw an
 DMA interrupt whenever a DMA transmission completed,
 even though DMA interrupt reporting is disabled.
 This confuses the esp routines as it would register
 a DMA interrupt whenever a cdb has been transmitted
 to the drive.
 So implement an option to use the FIFO for command
 submission and enable it for am53c974.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/am53c974.c |  1 +
  drivers/scsi/esp_scsi.c | 83 
 ++---
  drivers/scsi/esp_scsi.h |  1 +
  3 files changed, 66 insertions(+), 19 deletions(-)
 
 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 index b9af8b0..652762e 100644
 --- a/drivers/scsi/am53c974.c
 +++ b/drivers/scsi/am53c974.c
 @@ -401,6 +401,7 @@ static int pci_esp_probe_one(struct pci_dev *pdev,
   esp-host = shost;
   esp-dev = pdev;
   esp-ops = pci_esp_ops;
 + esp-flags |= ESP_FLAG_USE_FIFO;

Why not invert this patch and the previous one, and add this line
directly when the am53c974 driver is born?

   pep-esp = esp;
  
   if (pci_request_regions(pdev, DRV_MODULE_NAME)) {
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index 92ab921..ab7d2bc 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -605,7 +605,7 @@ static void esp_autosense(struct esp *esp, struct 
 esp_cmd_entry *ent)
  {
   struct scsi_cmnd *cmd = ent-cmd;
   struct scsi_device *dev = cmd-device;
 - int tgt, lun;
 + int tgt, lun, i;
   u8 *p, val;
  
   tgt = dev-id;
 @@ -626,7 +626,10 @@ static void esp_autosense(struct esp *esp, struct 
 esp_cmd_entry *ent)
  
   esp-active_cmd = ent;
  
 - p = esp-command_block;
 + if (esp-flags  ESP_FLAG_USE_FIFO)
 + p = esp-fifo;
 + else
 + p = esp-command_block;

Any reason not to use esp-command_block unconditionally?

   esp-msg_out_len = 0;
  
   *p++ = IDENTIFY(0, lun);
 @@ -648,12 +651,21 @@ static void esp_autosense(struct esp *esp, struct 
 esp_cmd_entry *ent)
   esp_write_tgt_sync(esp, tgt);
   esp_write_tgt_config3(esp, tgt);
  
 - val = (p - esp-command_block);
 + if (esp-flags  ESP_FLAG_USE_FIFO)
 + val = (p - esp-fifo);
 + else
 + val = (p - esp-command_block);
  
   if (esp-rev == FASHME)
   scsi_esp_cmd(esp, ESP_CMD_FLUSH);

For consistency with what you do elsewhere, please move this in the else.

 - esp-ops-send_dma_cmd(esp, esp-command_block_dma,
 -val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA);
 + if (esp-flags  ESP_FLAG_USE_FIFO) {
 + scsi_esp_cmd(esp, ESP_CMD_FLUSH);
 + for (i = 0; i  val; i++)
 + esp_write8(esp-fifo[i], ESP_FDATA);
 + scsi_esp_cmd(esp, ESP_CMD_SELA);
 + } else
 + esp-ops-send_dma_cmd(esp, esp-command_block_dma,
 +val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA);
  }

Hmm, what about a wrapper

__send_cmd(esp, data, esp_count, dma_count, cmd, force_flush,
   force_fifo)
{
use_fifo =
force_fifo || (esp-flags  ESP_FLAG_USE_FIFO) ||
esp_count == 1;
if (force_flush || use_fifo || esp-rev == FASHME)
scsi_esp_cmd(esp, ESP_CMD_FLUSH);
if (use_fifo) {
for (i = 0; i  val; i++)
esp_write8(esp-fifo[i], ESP_FDATA);
scsi_esp_cmd(esp, cmd);
} else {
if (data != esp-command_block)
memcpy(esp-command_block, data, length)
esp-ops-send_dma_cmd(esp,
   esp-command_block_dma,
   esp_count, dma_count, 0,
   cmd | ESP_CMD_DMA);
}
}

send_cmd(esp, data, esp_count, dma_count, cmd)
{
__send_cmd(esp, data, esp_count, dma_count, cmd, false, false);
}

This would be:

send_cmd(esp, esp-command_block, val, 16, ESP_CMD_SELA);

  static struct esp_cmd_entry *find_and_prep_issuable_command(struct esp *esp)
 @@ -727,7 +739,10 @@ static void esp_maybe_execute_command(struct esp *esp)
  
   esp_check_command_len(esp, cmd);
  
 - p = esp-command_block;
 + if (esp-flags  ESP_FLAG_USE_FIFO)
 + p = esp-fifo;

Any reason not to use esp-command_block unconditionally?

 + else
 + p = esp-command_block;
  
   esp-msg_out_len = 0;
   if (tp-flags  ESP_TGT_CHECK_NEGO) {
 @@ -789,13 +804,15 @@ build_identify:
   }
  
   if (!(esp-flags  ESP_FLAG_DOING_SLOWCMD)) {
 - start_cmd = ESP_CMD_DMA | ESP_CMD_SELA;
 + start_cmd = ESP_CMD_SELA;
   if (ent-tag[0]) {
   *p++ = ent-tag[0];
   *p++ = ent-tag[1];
  
 - start_cmd = ESP_CMD_DMA | ESP_CMD_SA3;
 +

Re: [PATCH 10/10] esp: enable CONFIG2_FENAB for am53c974

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 CONFIG2_FENAB ('feature enable') changed definition between chip
 revisions, from 'Latch SCSI Phase' to 'Latch SCSI Phase, display
 chip ID upon reset, and enable 24 bit addresses'.
 So only enable it for am53c974 where we know what it's doing.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/am53c974.c | 30 ++
  drivers/scsi/esp_scsi.c |  4 
  2 files changed, 34 insertions(+)
 
 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 index 0452ed1..722e781 100644
 --- a/drivers/scsi/am53c974.c
 +++ b/drivers/scsi/am53c974.c
 @@ -252,6 +252,8 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 
 addr, u32 esp_count,
  
   pci_esp_write8(esp, (esp_count  0)  0xff, ESP_TCLOW);
   pci_esp_write8(esp, (esp_count  8)  0xff, ESP_TCMED);
 + if (esp-config2  ESP_CONFIG2_FENAB)
 + pci_esp_write8(esp, (esp_count  16)  0xff, ESP_TCHI);

Why do this conditionally?  We know that FENAB is true here, don't we?

(Maybe I'm missing something obvious though).

Paolo
  
   pci_esp_write32(esp, esp_count, ESP_DMA_STC);
   pci_esp_write32(esp, addr, ESP_DMA_SPA);
 @@ -265,6 +267,33 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 
 addr, u32 esp_count,
   pci_esp_write8(esp, ESP_DMA_CMD_START | val, ESP_DMA_CMD);
  }
  
 +static u32 pci_esp_dma_length_limit(struct esp *esp, u32 dma_addr, u32 
 dma_len)
 +{
 + int dma_limit = 16;
 + u32 base, end;
 +
 + /*
 +  * If CONFIG2_FENAB is set we can
 +  * handle up to 24 bit addresses
 +  */
 + if (esp-config2  ESP_CONFIG2_FENAB)
 + dma_limit = 24;
 +
 + if (dma_len  (1U  dma_limit))
 + dma_len = (1U  dma_limit);
 +
 + /*
 +  * Prevent crossing a 24-bit address boundary.
 +  */
 + base = dma_addr  ((1U  24) - 1U);
 + end = base + dma_len;
 + if (end  (1U  24))
 + end = (1U 24);
 + dma_len = end - base;
 +
 + return dma_len;
 +}
 +
  static const struct esp_driver_ops pci_esp_ops = {
   .esp_write8 =   pci_esp_write8,
   .esp_read8  =   pci_esp_read8,
 @@ -278,6 +307,7 @@ static const struct esp_driver_ops pci_esp_ops = {
   .dma_invalidate =   pci_esp_dma_invalidate,
   .send_dma_cmd   =   pci_esp_send_dma_cmd,
   .dma_error  =   pci_esp_dma_error,
 + .dma_length_limit = pci_esp_dma_length_limit,
  };
  
  /*
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index 01753f5..d0b7b32 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -289,6 +289,8 @@ static void esp_reset_esp(struct esp *esp)
  
   case FAS236:
   case PCSCSI:
 + if (esp-rev == PCSCSI)
 + esp-config2 |= ESP_CONFIG2_FENAB;
   /* Fast 236, AM53c974 or HME */
   esp_write8(esp-config2, ESP_CFG2);
   if (esp-rev == FASHME) {
 @@ -1362,6 +1364,8 @@ static int esp_data_bytes_sent(struct esp *esp, struct 
 esp_cmd_entry *ent,
 (((unsigned int)esp_read8(ESP_TCMED))  8));
   if (esp-rev == FASHME)
   ecount |= ((unsigned int)esp_read8(FAS_RLO))  16;
 + if (esp-rev == PCSCSI  (esp-config2  ESP_CONFIG2_FENAB))
 + ecount |= ((unsigned int)esp_read8(ESP_TCHI))  16;
   }
  
   bytes_sent = esp-data_dma_len;
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 08/10] am53c974: BLAST residual handling

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 The am53c974 has an design issue where a single byte might be
 left in the SCSI FIFO after a DMA transfer.
 As the handling code is currently untested add a WARN_ON()
 statement here.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/am53c974.c |  6 ++
  drivers/scsi/esp_scsi.c | 29 +
  drivers/scsi/esp_scsi.h |  1 +
  3 files changed, 36 insertions(+)
 
 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 index 652762e..0b7643e 100644
 --- a/drivers/scsi/am53c974.c
 +++ b/drivers/scsi/am53c974.c
 @@ -195,6 +195,12 @@ static void pci_esp_dma_drain(struct esp *esp)
   shost_printk(KERN_INFO, esp-host,
DMA blast done (%d tries, %d bytes left)\n, lim, resid);
  #endif
 + /* BLAST residual handling is currently untested */
 + if (WARN_ON(resid == 1)) {

WARN_ON_ONCE?

Paolo

 + struct esp_cmd_entry *ent = esp-active_cmd;
 +
 + ent-flags |= ESP_CMD_FLAG_RESIDUAL;
 + }
  }
  
  static void pci_esp_dma_invalidate(struct esp *esp)
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index ab7d2bc..88272bb 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -1353,6 +1353,35 @@ static int esp_data_bytes_sent(struct esp *esp, struct 
 esp_cmd_entry *ent,
   bytes_sent = esp-data_dma_len;
   bytes_sent -= ecount;
  
 + /*
 +  * The am53c974 has a DMA 'pecularity'. The doc states:
 +  * In some odd byte conditions, one residual byte will
 +  * be left in the SCSI FIFO, and the FIFO Flags will
 +  * never count to '0 '. When this happens, the residual
 +  * byte should be retrieved via PIO following completion
 +  * of the BLAST operation.
 +  */
 + if (fifo_cnt == 1  ent-flags  ESP_CMD_FLAG_RESIDUAL) {
 + size_t count = 1;
 + size_t offset = bytes_sent;
 + u8 bval = esp_read8(ESP_FDATA);
 +
 + if (ent-flags  ESP_CMD_FLAG_AUTOSENSE)
 + ent-sense_ptr[bytes_sent] = bval;
 + else {
 + struct esp_cmd_priv *p = ESP_CMD_PRIV(cmd);
 + u8 *ptr;
 +
 + ptr = scsi_kmap_atomic_sg(p-cur_sg, p-u.num_sg,
 +   offset, count);
 + if (likely(ptr)) {
 + *(ptr + offset) = bval;
 + scsi_kunmap_atomic_sg(ptr);
 + }
 + }
 + bytes_sent += fifo_cnt;
 + ent-flags = ~ESP_CMD_FLAG_RESIDUAL;
 + }
   if (!(ent-flags  ESP_CMD_FLAG_WRITE))
   bytes_sent -= fifo_cnt;
  
 diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h
 index 27dcaf8..5fa456c 100644
 --- a/drivers/scsi/esp_scsi.h
 +++ b/drivers/scsi/esp_scsi.h
 @@ -269,6 +269,7 @@ struct esp_cmd_entry {
  #define ESP_CMD_FLAG_WRITE   0x01 /* DMA is a write */
  #define ESP_CMD_FLAG_ABORT   0x02 /* being aborted */
  #define ESP_CMD_FLAG_AUTOSENSE   0x04 /* Doing automatic REQUEST_SENSE */
 +#define ESP_CMD_FLAG_RESIDUAL0x08 /* AM53c974 BLAST residual */
  
   u8  tag[2];
   u8  orig_tag[2];
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 09/10] esp: correctly detect am53c974

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 The am53c974 returns the same ID as the FAS236, but implements
 things slightly differently. So detect the am53c974 by checking
 for ESP_CONFIG4 register.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/am53c974.c |  2 ++
  drivers/scsi/esp_scsi.c | 17 -
  drivers/scsi/esp_scsi.h | 15 +++
  3 files changed, 33 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 index 0b7643e..0452ed1 100644
 --- a/drivers/scsi/am53c974.c
 +++ b/drivers/scsi/am53c974.c
 @@ -365,6 +365,8 @@ static void dc390_check_eeprom(struct esp *esp)
   }
   esp-scsi_id = EEbuf[DC390_EE_ADAPT_SCSI_ID];
   esp-num_tags = 2  EEbuf[DC390_EE_TAG_CMD_NUM];
 + if (EEbuf[DC390_EE_MODE2]  DC390_EE_MODE2_ACTIVE_NEGATION)
 + esp-config4 |= ESP_CONFIG4_RADE | ESP_CONFIG4_RAE;
  }
  
  static int pci_esp_probe_one(struct pci_dev *pdev,
 diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
 index 88272bb..01753f5 100644
 --- a/drivers/scsi/esp_scsi.c
 +++ b/drivers/scsi/esp_scsi.c
 @@ -250,6 +250,19 @@ static void esp_reset_esp(struct esp *esp)
   } else {
   esp-min_period = ((5 * esp-ccycle) / 1000);
   }
 + if (esp-rev == FAS236) {
 + /*
 +  * The AM53c974 chip returns the same ID as FAS236;
 +  * try to configure glitch eater.
 +  */
 + u8 config4 = ESP_CONFIG4_GE1;
 + esp_write8(config4, ESP_CFG4);
 + config4 = esp_read8(ESP_CFG4);
 + if ((config4  ESP_CONFIG4_GE1) == ESP_CONFIG4_GE1) {
 + esp-rev = PCSCSI;
 + esp_write8(esp-config4, ESP_CFG4);
 + }
 + }
   esp-max_period = (esp-max_period + 3)2;
   esp-min_period = (esp-min_period + 3)2;
  
 @@ -275,7 +288,8 @@ static void esp_reset_esp(struct esp *esp)
   /* fallthrough... */
  
   case FAS236:
 - /* Fast 236 or HME */
 + case PCSCSI:
 + /* Fast 236, AM53c974 or HME */
   esp_write8(esp-config2, ESP_CFG2);
   if (esp-rev == FASHME) {
   u8 cfg3 = esp-target[0].esp_config3;
 @@ -2397,6 +2411,7 @@ static const char *esp_chip_names[] = {
   FAS100A,
   FAST,
   FASHME,
 + AM53C974,
  };
  
  static struct scsi_transport_template *esp_transport_template;
 diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h
 index 5fa456c..84dcbe4 100644
 --- a/drivers/scsi/esp_scsi.h
 +++ b/drivers/scsi/esp_scsi.h
 @@ -25,6 +25,7 @@
  #define ESP_CTEST0x0aUL  /* wo  Chip test register  0x28  */
  #define ESP_CFG2 0x0bUL  /* rw  Second cfg register 0x2c  */
  #define ESP_CFG3 0x0cUL  /* rw  Third cfg register  0x30  */
 +#define ESP_CFG4 0x0dUL  /* rw  Fourth cfg register 0x34  */
  #define ESP_TCHI 0x0eUL  /* rw  High bits transf count  0x38  */
  #define ESP_UID  ESP_TCHI/* ro  Unique ID code  
 0x38  */
  #define FAS_RLO  ESP_TCHI/* rw  HME extended counter
 0x38  */
 @@ -76,6 +77,18 @@
  #define ESP_CONFIG3_IMS   0x80 /* ID msg chk'ng(esp/fas236)  
 */
  #define ESP_CONFIG3_OBPUSH0x80 /* Push odd-byte to dma (hme) 
 */
  
 +/* ESP config register 4 read-write, found only on am53c974 chips */
 +#define ESP_CONFIG4_RADE  0x04 /* Active negation */
 +#define ESP_CONFIG4_RAE   0x08 /* Active negation on REQ and ACK */
 +#define ESP_CONFIG4_PWD   0x20 /* Reduced power feature */
 +#define ESP_CONFIG4_GE0   0x40 /* Glitch eater bit 0 */
 +#define ESP_CONFIG4_GE1   0x80 /* Glitch eater bit 1 */
 +
 +#define ESP_CONFIG_GE_12NS(0)
 +#define ESP_CONFIG_GE_25NS(ESP_CONFIG_GE1)
 +#define ESP_CONFIG_GE_35NS(ESP_CONFIG_GE0)
 +#define ESP_CONFIG_GE_0NS (ESP_CONFIG_GE0 | ESP_CONFIG_GE1)
 +
  /* ESP command register read-write */
  /* Group 1 commands:  These may be sent at any point in time to the ESP
   *chip.  None of them can generate interrupts 'cept
 @@ -254,6 +267,7 @@ enum esp_rev {
   FAS100A= 0x04,
   FAST   = 0x05,
   FASHME = 0x06,
 + PCSCSI = 0x07,  /* AM53c974 */
  };
  
  struct esp_cmd_entry {
 @@ -466,6 +480,7 @@ struct esp {
   u8  bursts;
   u8  config1;
   u8  config2;
 + u8  config4;
  
   u8  scsi_id;
   u32 scsi_id_mask;
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/10] scsi: add 'am53c974' driver

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 10:27, Hannes Reinecke wrote:
 This patch adds a new implementation for the Tekram DC-390T /
 AMD AM53c974 SCSI controller, based on the generic
 esp_scsi infrastructure.
 
 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/Kconfig|  18 ++
  drivers/scsi/Makefile   |   1 +
  drivers/scsi/am53c974.c | 523 
 
  3 files changed, 542 insertions(+)
  create mode 100644 drivers/scsi/am53c974.c
 
 diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
 index 21bc674..497e7d5 100644
 --- a/drivers/scsi/Kconfig
 +++ b/drivers/scsi/Kconfig
 @@ -1357,6 +1357,24 @@ config SCSI_DC390T
 To compile this driver as a module, choose M here: the
 module will be called tmscsim.
  
 +config SCSI_AM53C974
 + tristate Tekram DC390(T) and Am53/79C974 SCSI support (new driver)
 + depends on PCI  SCSI

Perhaps make it a choice with SCSI_DC390, since they match on the same
PCI vendor/device IDs?

Paolo

 + select SCSI_SPI_ATTRS
 + ---help---
 +   This driver supports PCI SCSI host adapters based on the Am53C974A
 +   chip, e.g. Tekram DC390(T), DawiControl 2974 and some onboard
 +   PCscsi/PCnet (Am53/79C974) solutions.
 +   This is a new implementation base on the generic esp_scsi driver.
 +
 +   Documentation can be found in file:Documentation/scsi/tmscsim.txt.
 +
 +   Note that this driver does NOT support Tekram DC390W/U/F, which are
 +   based on NCR/Symbios chips. Use NCR53C8XX SCSI support for those.
 +
 +   To compile this driver as a module, choose M here: the
 +   module will be called am53c974.
 +
  config SCSI_T128
   tristate Trantor T128/T128F/T228 SCSI support
   depends on ISA  SCSI
 diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
 index 76f8932..7f974fc 100644
 --- a/drivers/scsi/Makefile
 +++ b/drivers/scsi/Makefile
 @@ -101,6 +101,7 @@ obj-$(CONFIG_SCSI_7000FASST)  += wd7000.o
  obj-$(CONFIG_SCSI_EATA)  += eata.o
  obj-$(CONFIG_SCSI_DC395x)+= dc395x.o
  obj-$(CONFIG_SCSI_DC390T)+= tmscsim.o
 +obj-$(CONFIG_SCSI_AM53C974)  += esp_scsi.o   am53c974.o
  obj-$(CONFIG_MEGARAID_LEGACY)+= megaraid.o
  obj-$(CONFIG_MEGARAID_NEWGEN)+= megaraid/
  obj-$(CONFIG_MEGARAID_SAS)   += megaraid/
 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 new file mode 100644
 index 000..b9af8b0
 --- /dev/null
 +++ b/drivers/scsi/am53c974.c
 @@ -0,0 +1,523 @@
 +/*
 + * AMD am53c974 driver.
 + * Copyright (c) 2014 Hannes Reinecke, SUSE Linux GmbH
 + */
 +
 +#include linux/kernel.h
 +#include linux/module.h
 +#include linux/init.h
 +#include linux/delay.h
 +#include linux/pci.h
 +#include linux/interrupt.h
 +
 +#include scsi/scsi_host.h
 +
 +#include esp_scsi.h
 +
 +#define DRV_MODULE_NAME am53c974
 +#define DRV_MODULE_VERSION 1.00
 +
 +// #define ESP_DMA_DEBUG
 +
 +#define ESP_DMA_CMD 0x10
 +#define ESP_DMA_STC 0x11
 +#define ESP_DMA_SPA 0x12
 +#define ESP_DMA_WBC 0x13
 +#define ESP_DMA_WAC 0x14
 +#define ESP_DMA_STATUS 0x15
 +#define ESP_DMA_SMDLA 0x16
 +#define ESP_DMA_WMAC 0x17
 +
 +#define ESP_DMA_CMD_IDLE 0x00
 +#define ESP_DMA_CMD_BLAST 0x01
 +#define ESP_DMA_CMD_ABORT 0x02
 +#define ESP_DMA_CMD_START 0x03
 +#define ESP_DMA_CMD_MASK  0x03
 +#define ESP_DMA_CMD_DIAG 0x04
 +#define ESP_DMA_CMD_MDL 0x10
 +#define ESP_DMA_CMD_INTE_P 0x20
 +#define ESP_DMA_CMD_INTE_D 0x40
 +#define ESP_DMA_CMD_DIR 0x80
 +
 +#define ESP_DMA_STAT_PWDN 0x01
 +#define ESP_DMA_STAT_ERROR 0x02
 +#define ESP_DMA_STAT_ABORT 0x04
 +#define ESP_DMA_STAT_DONE 0x08
 +#define ESP_DMA_STAT_SCSIINT 0x10
 +#define ESP_DMA_STAT_BCMPLT 0x20
 +
 +/* EEPROM is accessed with 16-bit values */
 +#define DC390_EEPROM_READ 0x80
 +#define DC390_EEPROM_LEN 0x40
 +
 +/*
 + * DC390 EEPROM
 + *
 + * 8 * 4 bytes of per-device options
 + * followed by HBA specific options
 + */
 +
 +/* Per-device options */
 +#define DC390_EE_MODE1 0x00
 +#define DC390_EE_SPEED 0x01
 +
 +/* HBA-specific options */
 +#define DC390_EE_ADAPT_SCSI_ID 0x40
 +#define DC390_EE_MODE2 0x41
 +#define DC390_EE_DELAY 0x42
 +#define DC390_EE_TAG_CMD_NUM 0x43
 +
 +#define DC390_EE_MODE1_PARITY_CHK   0x01
 +#define DC390_EE_MODE1_SYNC_NEGO0x02
 +#define DC390_EE_MODE1_EN_DISC  0x04
 +#define DC390_EE_MODE1_SEND_START   0x08
 +#define DC390_EE_MODE1_TCQ  0x10
 +
 +#define DC390_EE_MODE2_MORE_2DRV0x01
 +#define DC390_EE_MODE2_GREATER_1G   0x02
 +#define DC390_EE_MODE2_RST_SCSI_BUS 0x04
 +#define DC390_EE_MODE2_ACTIVE_NEGATION 0x08
 +#define DC390_EE_MODE2_NO_SEEK  0x10
 +#define DC390_EE_MODE2_LUN_CHECK0x20
 +
 +struct pci_esp_priv {
 + struct esp *esp;
 + u8 dma_status;
 +};
 +
 +#define PCI_ESP_GET_PRIV(esp) ((struct pci_esp_priv *) \
 +pci_get_drvdata((struct pci_dev *) \
 +(esp)-dev))
 +
 +static void pci_esp_dma_drain(struct esp *esp);
 +
 +static void pci_esp_write8(struct esp *esp, u8

Re: [PATCH 06/10] scsi: add 'am53c974' driver

2014-11-21 Thread Paolo Bonzini

Oops, hit send too early.  A small nit:

On 21/11/2014 10:27, Hannes Reinecke wrote:
 +static void pci_esp_dma_drain(struct esp *esp)
 +{
 + u8 resid;
 + int lim = 1000;
 +
 +
 + if ((esp-sreg  ESP_STAT_PMASK) == ESP_DOP ||
 + (esp-sreg  ESP_STAT_PMASK) == ESP_DIP)
 + /* Data-In or Data-Out, nothing to be done */
 + return;
 +
 + while (--lim  0) {
 + resid = pci_esp_read8(esp, ESP_FFLAGS)  ESP_FF_FBYTES;
 + if (resid = 1)
 + break;

cpu_relax()?

 + }
 + if (resid  1) {
 + /* FIFO not cleared */
 + shost_printk(KERN_INFO, esp-host,
 +  FIFO not cleared, %d bytes left\n,
 +  resid);
 + }
 +
 + /*
 +  * When there is a residual BCMPLT will never be set
 +  * (obviously). But we still have to issue the BLAST
 +  * command, otherwise the data will not being transferred.
 +  * But we'll never know when the BLAST operation is
 +  * finished. So check for some time and give up eventually.
 +  */
 + lim = 1000;
 + pci_esp_write8(esp, ESP_DMA_CMD_DIR | ESP_DMA_CMD_BLAST, ESP_DMA_CMD);
 + while (pci_esp_read8(esp, ESP_DMA_STATUS)  ESP_DMA_STAT_BCMPLT) {
 + if (--lim == 0)
 + break;

cpu_relax()?

Otherwise looks sane.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/10] esp: Use FIFO for command submission

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 11:38, Hannes Reinecke wrote:
 esp-msg_out_len = 0;
  
 *p++ = IDENTIFY(0, lun);
 @@ -648,12 +651,21 @@ static void esp_autosense(struct esp *esp, struct 
 esp_cmd_entry *ent)
 esp_write_tgt_sync(esp, tgt);
 esp_write_tgt_config3(esp, tgt);
  
 -   val = (p - esp-command_block);
 +   if (esp-flags  ESP_FLAG_USE_FIFO)
 +   val = (p - esp-fifo);
 +   else
 +   val = (p - esp-command_block);
  
 if (esp-rev == FASHME)
 scsi_esp_cmd(esp, ESP_CMD_FLUSH);

 For consistency with what you do elsewhere, please move this in the else.

 No. The 'USE_FIFO' mechanism is a general feature which _should_
 work on all chips. The above 'if' condition is a chipset-specific
 feature which should be treated separately.

Yup, but USE_FIFO always sends a flush anyway.  Sending two is useless.

 -   esp-ops-send_dma_cmd(esp, esp-command_block_dma,
 -  val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA);
 +   if (esp-flags  ESP_FLAG_USE_FIFO) {
 +   scsi_esp_cmd(esp, ESP_CMD_FLUSH);
 +   for (i = 0; i  val; i++)
 +   esp_write8(esp-fifo[i], ESP_FDATA);
 +   scsi_esp_cmd(esp, ESP_CMD_SELA);
 +   } else
 +   esp-ops-send_dma_cmd(esp, esp-command_block_dma,
 +  val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA);
  }

 Hmm, what about a wrapper

 __send_cmd(esp, data, esp_count, dma_count, cmd, force_flush,
force_fifo)
 {
 use_fifo =
 force_fifo || (esp-flags  ESP_FLAG_USE_FIFO) ||
 esp_count == 1;
 if (force_flush || use_fifo || esp-rev == FASHME)
 scsi_esp_cmd(esp, ESP_CMD_FLUSH);
 if (use_fifo) {
 for (i = 0; i  val; i++)
 esp_write8(esp-fifo[i], ESP_FDATA);
 scsi_esp_cmd(esp, cmd);
 } else {
 if (data != esp-command_block)
 memcpy(esp-command_block, data, length)
 esp-ops-send_dma_cmd(esp,
esp-command_block_dma,
esp_count, dma_count, 0,
cmd | ESP_CMD_DMA);
 }
 }

 send_cmd(esp, data, esp_count, dma_count, cmd)
 {
 __send_cmd(esp, data, esp_count, dma_count, cmd, false, false);
 }

 This would be:

 send_cmd(esp, esp-command_block, val, 16, ESP_CMD_SELA);

 Good point.

(Note this was also why I was suggesting just using esp-command_block
unconditionally).

Paolo

 Will be updating the patch.
 
 Cheers,
 
 Hannes
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 10/10] esp: enable CONFIG2_FENAB for am53c974

2014-11-21 Thread Paolo Bonzini



On 21/11/2014 11:22, Hannes Reinecke wrote:
 On 11/21/2014 11:08 AM, Paolo Bonzini wrote:


 On 21/11/2014 10:27, Hannes Reinecke wrote:
 CONFIG2_FENAB ('feature enable') changed definition between chip
 revisions, from 'Latch SCSI Phase' to 'Latch SCSI Phase, display
 chip ID upon reset, and enable 24 bit addresses'.
 So only enable it for am53c974 where we know what it's doing.

 Signed-off-by: Hannes Reinecke h...@suse.de
 ---
  drivers/scsi/am53c974.c | 30 ++
  drivers/scsi/esp_scsi.c |  4 
  2 files changed, 34 insertions(+)

 diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
 index 0452ed1..722e781 100644
 --- a/drivers/scsi/am53c974.c
 +++ b/drivers/scsi/am53c974.c
 @@ -252,6 +252,8 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 
 addr, u32 esp_count,
  
 pci_esp_write8(esp, (esp_count  0)  0xff, ESP_TCLOW);
 pci_esp_write8(esp, (esp_count  8)  0xff, ESP_TCMED);
 +   if (esp-config2  ESP_CONFIG2_FENAB)
 +   pci_esp_write8(esp, (esp_count  16)  0xff, ESP_TCHI);

 Why do this conditionally?  We know that FENAB is true here, don't we?

 (Maybe I'm missing something obvious though).

 Not really. Point is that 'FENAB' does actually three things:
 - Enable TCHI for 24-bit DMA transfer lengths
 - Provide Chip ID in TCHI after reset
 - Latch SCSI phase after completion in SCSI STATUS
 So we _might_ run into timing issues due to the last point, so I've
 made it conditional in case we'd have to disable it.

Maybe make it a module parameter maybe?

Something like this lets you set esp-config2 from the driver.  Look at
it with -b to avoid insanity, it changes half a dozen line modulo the
reindendation.

-- 8 
From: Paolo Bonzini pbonz...@redhat.com
Subject: [PATCH] esp_scsi: let DMA driver provide a config2 value

On PCscsi, the FENAB configuration also enables 24-bit DMA transfer lengths
(and provides the chip id in TCHI after reset).  We want this to be available
as a module parameter.

Check if the caller of scsi_esp_register provided a value for esp-config2.
If this is the case, assume this is not an ESP100, skip the detection
phase and leave esp-config2 untouched.  It will be used in esp_reset_esp.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com

diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c
index 55548dc5cec3..123edcf439c0 100644
--- a/drivers/scsi/esp_scsi.c
+++ b/drivers/scsi/esp_scsi.c
@@ -2149,47 +2149,50 @@ static void esp_get_revision(struct esp *esp)
u8 val;
 
esp-config1 = (ESP_CONFIG1_PENABLE | (esp-scsi_id  7));
-   esp-config2 = (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY);
+   if (esp-config2 == 0) {
+   esp-config2 = (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY);
+   esp_write8(esp-config2, ESP_CFG2);
+
+   val = esp_read8(ESP_CFG2);
+   val = ~ESP_CONFIG2_MAGIC;
+
+   esp-config2 = 0;
+   if (val != (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY)) {
+   /* If what we write to cfg2 does not come back, cfg2 is 
not
+* implemented, therefore this must be a plain esp100.
+*/
+   esp-rev = ESP100;
+   return;
+   }
+   }
+
+   esp_set_all_config3(esp, 5);
+   esp-prev_cfg3 = 5;
esp_write8(esp-config2, ESP_CFG2);
+   esp_write8(0, ESP_CFG3);
+   esp_write8(esp-prev_cfg3, ESP_CFG3);
 
-   val = esp_read8(ESP_CFG2);
-   val = ~ESP_CONFIG2_MAGIC;
-   if (val != (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY)) {
-   /* If what we write to cfg2 does not come back, cfg2 is not
-* implemented, therefore this must be a plain esp100.
+   val = esp_read8(ESP_CFG3);
+   if (val != 5) {
+   /* The cfg2 register is implemented, however
+* cfg3 is not, must be esp100a.
 */
-   esp-rev = ESP100;
+   esp-rev = ESP100A;
} else {
-   esp-config2 = 0;
-   esp_set_all_config3(esp, 5);
-   esp-prev_cfg3 = 5;
-   esp_write8(esp-config2, ESP_CFG2);
-   esp_write8(0, ESP_CFG3);
+   esp_set_all_config3(esp, 0);
+   esp-prev_cfg3 = 0;
esp_write8(esp-prev_cfg3, ESP_CFG3);
 
-   val = esp_read8(ESP_CFG3);
-   if (val != 5) {
-   /* The cfg2 register is implemented, however
-* cfg3 is not, must be esp100a.
-*/
-   esp-rev = ESP100A;
+   /* All of cfg{1,2,3} implemented, must be one of
+* the fas variants, figure out which one.
+*/
+   if (esp-cfact == 0 || esp-cfact  ESP_CCF_F5) {
+   esp-rev = FAST;
+   esp-sync_defp = SYNC_DEFP_FAST

Re: [PATCH 06/12] esp_scsi: use FIFO for command submission

2014-11-21 Thread Paolo Bonzini

 ESP_FLAG_QUICKIRQ_CHECK  0x0010
  #define ESP_FLAG_DISABLE_SYNC0x0020
 +#define ESP_FLAG_USE_FIFO0x0040
  
   u8  select_state;
  #define ESP_SELECT_NONE  0x00 /* Not selecting */
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH V1] virtio_scsi: support multi hw queue of blk-mq

2014-11-17 Thread Paolo Bonzini



On 15/11/2014 04:47, Ming Lei wrote:
 Since virtio_scsi has supported multi virtqueue already,
 it is natural to map the virtque to hw-queue of blk-mq.
 
 Cc: Paolo Bonzini pbonz...@redhat.com
 Signed-off-by: Ming Lei ming@canonical.com
 ---
 V1:
   - support non-mq too
 
  drivers/scsi/virtio_scsi.c |   17 -
  1 file changed, 16 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index b83846f..d3af70e 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -561,6 +561,15 @@ static int virtscsi_queuecommand_single(struct Scsi_Host 
 *sh,
   return virtscsi_queuecommand(vscsi, vscsi-req_vqs[0], sc);
  }
  
 +static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
 +   struct scsi_cmnd *sc)
 +{
 + u32 tag = blk_mq_unique_tag(sc-request);
 + u16 hwq = blk_mq_unique_tag_to_hwq(tag);
 +
 + return vscsi-req_vqs[hwq];
 +}
 +
  static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
  struct virtio_scsi_target_state 
 *tgt)
  {
 @@ -604,7 +613,12 @@ static int virtscsi_queuecommand_multi(struct Scsi_Host 
 *sh,
   struct virtio_scsi *vscsi = shost_priv(sh);
   struct virtio_scsi_target_state *tgt =
   scsi_target(sc-device)-hostdata;
 - struct virtio_scsi_vq *req_vq = virtscsi_pick_vq(vscsi, tgt);
 + struct virtio_scsi_vq *req_vq;
 +
 + if (shost_use_blk_mq(sh))
 + req_vq = virtscsi_pick_vq_mq(vscsi, sc);
 + else
 + req_vq = virtscsi_pick_vq(vscsi, tgt);
  
   return virtscsi_queuecommand(vscsi, req_vq, sc);
  }
 @@ -983,6 +997,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
   shost-max_id = num_targets;
   shost-max_channel = 0;
   shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
 + shost-nr_hw_queues = num_queues;
  
   if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {
   host_prot = SHOST_DIF_TYPE1_PROTECTION | 
 SHOST_DIF_TYPE2_PROTECTION |
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] virtio_scsi: support multi hw queue of blk-mq

2014-11-11 Thread Paolo Bonzini



On 10/11/2014 17:05, Christoph Hellwig wrote:
 
  a) there is no multiath support for it, and we simply can't break existing
 setups that use.
  b) there is no support for I/O schedulers at all.  This might be okay
 for virtio-scsi, where in general you have a host scheduler, but for
 real hardware that still has to deal with more spinning rust than
 solid state it's a blocker.  Fortuntely Jens is working on that one.

It is sort-of okay for virtio-scsi (and virtio-blk).  Deadline provides
a good speedup over noop for virtual disks, even in the presence of a
host scheduler---though nobody ever benchmarked why.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] virtio_scsi: support multi hw queue of blk-mq

2014-11-10 Thread Paolo Bonzini

On 09/11/2014 17:57, Ming Lei wrote:
 Since virtio_scsi has supported multi virtqueue already,
 it is natural to map virtque to hw-queue of blk-mq.
 
 Cc: Paolo Bonzini pbonz...@redhat.com
 Signed-off-by: Ming Lei ming@canonical.com
 ---
  drivers/scsi/virtio_scsi.c |  154 
 
  1 file changed, 14 insertions(+), 140 deletions(-)

Nice. :)

FWIW, instead of force_blk_mq, it would be fine for me to make
virtio-scsi use a single queue if not using blk-mq.

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] sd: Disable discard_zeroes_data for UNMAP

2014-11-10 Thread Paolo Bonzini

On 07/11/2014 06:08, Martin K. Petersen wrote:
 The T10 SBC UNMAP command does not provide any hard guarantees that
 blocks will return zeroes on a subsequent READ. This is due to the fact
 that the device server is free to silently ignore all or parts of the
 request.
 
 The only way to ensure that a block consistently returns zeroes after
 being unmapped is to use WRITE SAME with the UNMAP bit set. Should the
 device be unable to unmap one or more blocks described by the command it
 is required to manually write zeroes to them.
 
 Until now we have preferred UNMAP over the WRITE SAME variants to
 accommodate thinly provisioned devices that predated the final SBC-3
 spec. This patch changes the heuristic so that we favor WRITE SAME(16)
 or (10) over UNMAP if these commands are marked as supported in the
 Logical Block Provisioning VPD page.
 
 The patch also disables discard_zeroes_data for devices operating in
 UNMAP mode.
 
 Signed-off-by: Martin K. Petersen martin.peter...@oracle.com
 ---
  drivers/scsi/sd.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
 index b041eca8955d..95bfb7bfbb9d 100644
 --- a/drivers/scsi/sd.c
 +++ b/drivers/scsi/sd.c
 @@ -656,7 +656,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, 
 unsigned int mode)
   unsigned int logical_block_size = sdkp-device-sector_size;
   unsigned int max_blocks = 0;
  
 - q-limits.discard_zeroes_data = sdkp-lbprz;
 + q-limits.discard_zeroes_data = 0;
   q-limits.discard_alignment = sdkp-unmap_alignment *
   logical_block_size;
   q-limits.discard_granularity =
 @@ -680,11 +680,13 @@ static void sd_config_discard(struct scsi_disk *sdkp, 
 unsigned int mode)
   case SD_LBP_WS16:
   max_blocks = min_not_zero(sdkp-max_ws_blocks,
 (u32)SD_MAX_WS16_BLOCKS);
 + q-limits.discard_zeroes_data = sdkp-lbprz;
   break;
  
   case SD_LBP_WS10:
   max_blocks = min_not_zero(sdkp-max_ws_blocks,
 (u32)SD_MAX_WS10_BLOCKS);
 + q-limits.discard_zeroes_data = sdkp-lbprz;
   break;
  
   case SD_LBP_ZERO:
 @@ -2622,12 +2624,12 @@ static void sd_read_block_limits(struct scsi_disk 
 *sdkp)
  
   } else {/* LBP VPD page tells us what to use */
  
 - if (sdkp-lbpu  sdkp-max_unmap_blocks)
 - sd_config_discard(sdkp, SD_LBP_UNMAP);
 - else if (sdkp-lbpws)
 + if (sdkp-lbpws)
   sd_config_discard(sdkp, SD_LBP_WS16);
   else if (sdkp-lbpws10)
   sd_config_discard(sdkp, SD_LBP_WS10);
 + else if (sdkp-lbpu  sdkp-max_unmap_blocks)
 + sd_config_discard(sdkp, SD_LBP_UNMAP);
   else
   sd_config_discard(sdkp, SD_LBP_DISABLE);
   }
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT

2014-10-09 Thread Paolo Bonzini

Il 09/10/2014 06:14, Nicholas A. Bellinger ha scritto:
 AFAICT from qemu code, the ioctl VHOST_SCSI_CLEAR_ENDPOINT is always
 called during shutdown in order to release the endpoint and drop this
 new configfs dependency.

As far as I can see, the only path leading to the ioctl is
vhost_scsi_set_status-vhost_scsi_stop.  That only happens if the guest
driver resets the device upon shutdown, or via vhost_scsi_unrealize as
you pointed out.  But unrealize() is only called when a device is
hot-unplugged.

It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
quit command, because no attempt is done to bring down the VM data
structures (or free memory, or close file descriptors) in case of a
fatal exit.  The kernel should do that for us.

Besides that...

 The question is, what happens when qemu crashes..?  Is there currently
 an assurance that VHOST_SCSI_CLEAR_ENDPOINT is called via the normal
 VirtioDeviceClass-unrealize() when qemu exits abnormally..?

... of course nothing is called if you SIGKILL QEMU.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT

2014-10-09 Thread Paolo Bonzini

Il 09/10/2014 10:49, Paolo Bonzini ha scritto:
 
 It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
 quit command, because no attempt is done to bring down the VM data
 structures (or free memory, or close file descriptors) in case of a
 fatal exit.  The kernel should do that for us.

... and in the case of vhost-scsi, doesn't it do that when
vhost_scsi_release calls vhost_scsi_clear_endpoint?

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-29 Thread Paolo Bonzini

Il 29/08/2014 08:08, Hannes Reinecke ha scritto:

 No.
 FAILED for any eh_abort_cmd() means that the TMF hasn't been sent.
 So the midlayer escalates to the next EH step.
 The command will only ever be re-issued once EH completes.

Then the answer to Hans's question is yes.  It is legal to call
-scsi_done() after the eh_abort handler returns FAILED.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-28 Thread Paolo Bonzini

Il 28/08/2014 14:26, Hans de Goede ha scritto:
  Then, blk_complete_request will do nothing because its call to
  blk_mark_rq_complete returns true.
  
  All this, of course, as long as -scsi_done is called _before_ eh_abort
  returns.
 What about calling scsi_done after eh_abort if eh_abort returned FAILED?

I invoke the fifth amendment. :)

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-28 Thread Paolo Bonzini

Il 28/08/2014 16:17, Hannes Reinecke ha scritto:

 As mentioned earlier, as soon as SCSI EH is invoked control
 is assumed to be transferred back to the SCSI midlayer.
 How the midlayer interprets any return value from the various eh_XX
 callbacks is immaterial to the LLDD.
 
 So even if the eh_abort returns FAILED control is still with the SCSI
 midlayer, so the earlier statements apply.
 IE the command will be short-circuited by the block layer anyway if
 -scsi_done() is called.

As I parsed it, the question is not whether the short-circuiting will
happen.  It's whether you will have use-after-free bugs or not if you
call -scsi_done() after eh_abort returns FAILED.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-28 Thread Paolo Bonzini

Il 28/08/2014 17:50, Elliott, Robert (Server Storage) ha scritto:
 Is the block layer prevented from issuing a new command with the
 same tag before the error handling is finished?

Tags are chosen by the LLDs, so it's up to it to pick the right tags.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-25 Thread Paolo Bonzini

Il 23/08/2014 16:52, Hans de Goede ha scritto:
 Hi All,
 
 Now that the UAS driver is no longer marked as CONFIG_BROKEN,
 I'm getting quite a few bug reports about issues with UAS drives.
 
 One if the issues is that there might be a number of bugs in the
 abort handling path, as I don't think that was ever tested properly.
 
 So I'm wondering is there a way to test the abort path with a real
 drive? E.G. submit some command which is known to take a significant
 amount of time, and then abort it right after submitting ?

You could have some luck with QEMU's UAS emulation.  If you set QEMU's
I/O throttling options low enough (e.g. 100KB/sec), and then use dd with
iflag=direct and a largish block size (a few MBs), the guest should
abort its I/O soon enough.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Debugging scsi abort handling ?

2014-08-25 Thread Paolo Bonzini

Il 25/08/2014 12:28, Bart Van Assche ha scritto:
 
 From SPC-4: 7.5.8 Control mode page [ ... ] A task aborted status (TAS)
 bit set to zero specifies that aborted commands shall be terminated by
 the device server without any response to the application client. A TAS
 bit set to one specifies that commands aborted by the actions of an I_T
 nexus other than the I_T nexus on which the command was received shall
 be completed with TASK ABORTED status (see SAM-5).

Note the aborted by the actions of an I_T nexus other than the I_T
nexus on which the command was received.

In practice, this means that TASK ABORTED should only happen if you use
the CLEAR TASK SET tmf and TST is not set to 001b (i.e. _not_ to per
I_T nexus) in the Control mode page.  It should never happen for a pen
drive.

Setting TASK ABORTED aside, the important part is that an abort can do
one of two things:

- complete the command, and then eh_abort should return after the driver
has noticed the completion and called the -scsi_done callback for the
Scsi_Cmnd*.

- abort the command, and then the driver should never call the
-scsi_done callback for the Scsi_Cmnd*.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] virtio_scsi: check on resp-sense_len instead of 'sense_buffer'

2014-07-18 Thread Paolo Bonzini

Il 18/07/2014 16:57, Ming Lei ha scritto:
 - if (sc-sense_buffer) {
 + if (resp-sense_len) {

In the (unlikely) case that sc-sense_buffer == NULL, you'd pass a NULL
to memcpy.

If you want, you can change this if to

if (sc-sense_buffer  resp-sense_len)

but frankly it seems like slightly pointless churn to me.

Paolo

   memcpy(sc-sense_buffer, resp-sense,
  min_t(u32, resp-sense_len, VIRTIO_SCSI_SENSE_SIZE));
 - if (resp-sense_len)
 - set_driver_byte(sc, DRIVER_SENSE);
 + set_driver_byte(sc, DRIVER_SENSE);
   }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/2] virtio-scsi queue for 3.17

2014-07-06 Thread Paolo Bonzini

Christoph asked me to rebase the two virtio-scsi patches from
http://thread.gmane.org/gmane.linux.kernel/1717796 that do not apply
anymore, patch 2 and 6.

These are on top of his drivers-for-3.16 branch, more precisely commit
8faeb529b2da (virtio-scsi: fix various bad behavior on aborted requests).

Paolo

Ming Lei (1):
  virtio-scsi: replace target spinlock with seqcount

Venkatesh Srinivas (1):
  virtio-scsi: Implement change_queue_depth for virtscsi targets

 drivers/scsi/virtio_scsi.c | 75 ++
 1 file changed, 62 insertions(+), 13 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] virtio-scsi: replace target spinlock with seqcount

2014-07-06 Thread Paolo Bonzini

From: Ming Lei ming@canonical.com

The spinlock of tgt_lock is only for serializing read and write
req_vq, one lockless seqcount is enough for the purpose.

On one 16core VM with vhost-scsi backend, the patch can improve
IOPS with 3% on random read test.

Signed-off-by: Ming Lei ming@canonical.com
[Add initialization in virtscsi_target_alloc. - Paolo]
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 drivers/scsi/virtio_scsi.c | 42 +-
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 308256b5e4cb..f67a872de14b 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -27,6 +27,7 @@
 #include scsi/scsi_host.h
 #include scsi/scsi_device.h
 #include scsi/scsi_cmnd.h
+#include linux/seqlock.h
 
 #define VIRTIO_SCSI_MEMPOOL_SZ 64
 #define VIRTIO_SCSI_EVENT_LEN 8
@@ -75,18 +76,16 @@ struct virtio_scsi_vq {
  * queue, and also lets the driver optimize the IRQ affinity for the virtqueues
  * (each virtqueue's affinity is set to the CPU that owns the queue).
  *
- * tgt_lock is held to serialize reading and writing req_vq. Reading req_vq
- * could be done locklessly, but we do not do it yet.
+ * tgt_seq is held to serialize reading and writing req_vq.
  *
  * Decrements of reqs are never concurrent with writes of req_vq: before the
  * decrement reqs will be != 0; after the decrement the virtqueue completion
  * routine will not use the req_vq so it can be changed by a new request.
- * Thus they can happen outside the tgt_lock, provided of course we make reqs
+ * Thus they can happen outside the tgt_seq, provided of course we make reqs
  * an atomic_t.
  */
 struct virtio_scsi_target_state {
-   /* This spinlock never held at the same time as vq_lock. */
-   spinlock_t tgt_lock;
+   seqcount_t tgt_seq;
 
/* Count of outstanding requests. */
atomic_t reqs;
@@ -559,19 +558,33 @@ static struct virtio_scsi_vq *virtscsi_pick_vq(struct 
virtio_scsi *vscsi,
unsigned long flags;
u32 queue_num;
 
-   spin_lock_irqsave(tgt-tgt_lock, flags);
+   local_irq_save(flags);
+   if (atomic_inc_return(tgt-reqs)  1) {
+   unsigned long seq;
+
+   do {
+   seq = read_seqcount_begin(tgt-tgt_seq);
+   vq = tgt-req_vq;
+   } while (read_seqcount_retry(tgt-tgt_seq, seq));
+   } else {
+   /* no writes can be concurrent because of atomic_t */
+   write_seqcount_begin(tgt-tgt_seq);
+
+   /* keep previous req_vq if a reader just arrived */
+   if (unlikely(atomic_read(tgt-reqs)  1)) {
+   vq = tgt-req_vq;
+   goto unlock;
+   }
 
-   if (atomic_inc_return(tgt-reqs)  1)
-   vq = tgt-req_vq;
-   else {
queue_num = smp_processor_id();
while (unlikely(queue_num = vscsi-num_queues))
queue_num -= vscsi-num_queues;
-
tgt-req_vq = vq = vscsi-req_vqs[queue_num];
+ unlock:
+   write_seqcount_end(tgt-tgt_seq);
}
+   local_irq_restore(flags);
 
-   spin_unlock_irqrestore(tgt-tgt_lock, flags);
return vq;
 }
 
@@ -667,14 +680,17 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
 
 static int virtscsi_target_alloc(struct scsi_target *starget)
 {
+   struct Scsi_Host *sh = dev_to_shost(starget-dev.parent);
+   struct virtio_scsi *vscsi = shost_priv(sh);
+
struct virtio_scsi_target_state *tgt =
kmalloc(sizeof(*tgt), GFP_KERNEL);
if (!tgt)
return -ENOMEM;
 
-   spin_lock_init(tgt-tgt_lock);
+   seqcount_init(tgt-tgt_seq);
atomic_set(tgt-reqs, 0);
-   tgt-req_vq = NULL;
+   tgt-req_vq = vscsi-req_vqs[0];
 
starget-hostdata = tgt;
return 0;
-- 
1.8.3.1


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 5/5] scsi: remove various exports that were only used by scsi_tgt

2014-07-04 Thread Paolo Bonzini

 *);
 extern void scsi_finish_command(struct scsi_cmnd *cmd);

 extern void *scsi_kmap_atomic_sg(struct scatterlist *sg, int sg_count,
 size_t *offset, size_t *len);
 extern void scsi_kunmap_atomic_sg(void *virt);

-extern int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask);
-extern void scsi_release_buffers(struct scsi_cmnd *cmd);
-
 extern int scsi_dma_map(struct scsi_cmnd *cmd);
 extern void scsi_dma_unmap(struct scsi_cmnd *cmd);

diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 28336c6..4c38c16 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -795,8 +795,6 @@ extern struct Scsi_Host *scsi_host_lookup(unsigned short);
 extern const char *scsi_host_state_name(enum scsi_host_state);
 extern void scsi_cmd_get_serial(struct Scsi_Host *, struct scsi_cmnd *);

-extern u64 scsi_calculate_bounce_limit(struct Scsi_Host *);
-
 static inline int __must_check scsi_add_host(struct Scsi_Host *host,
 struct device *dev)
 {



Reviewed-by: Paolo Bonzini pbonz...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: tgt infrastructure removal

2014-07-04 Thread Paolo Bonzini


Il 04/07/2014 11:36, Christoph Hellwig ha scritto:

I got an older Ack from Tomo for the removal, and some sort of acks
for the ibmvscsi target removal from Brian, Paolo and Nathan.

Can I get some formal reviews for the code removal patches (1-4) so I
can queue this up for 3.17?


Not that there's much to review there. :)

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/5] ibmvstgt: remove

2014-07-01 Thread Paolo Bonzini


Il 24/06/2014 16:28, Christoph Hellwig ha scritto:

 Adding Paul and Nathan to cc here. I'm pretty sure the backend for ibmvscsi in
 KVM was all done in qemu and there is no dependency on ibmvstgt.



FWIW the ibmvscsi backend is indeed entirely in userspace 
(hw/scsi/spapr_vscsi.c in the QEMU tree).


Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 3 4 >

1 - 100 of 378 matches

Mail list logo