Re: [PATCH V2 6/8] scsi: virtio_scsi: fix IO hang by irq vector automatic affinity
On 05/02/2018 16:20, Ming Lei wrote: > Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs) > has been merged to V4.16-rc, and it is easy to allocate all offline CPUs > for some irq vectors, this can't be avoided even though the allocation > is improved. > > For example, on a 8cores VM, 4~7 are not-present/offline, 4 queues of > virtio-scsi, the irq affinity assigned can become the following shape: > > irq 36, cpu list 0-7 > irq 37, cpu list 0-7 > irq 38, cpu list 0-7 > irq 39, cpu list 0-1 > irq 40, cpu list 4,6 > irq 41, cpu list 2-3 > irq 42, cpu list 5,7 > > Then IO hang is triggered in case of non-SCSI_MQ. > > Given storage IO is always C/S model, there isn't such issue with > SCSI_MQ(blk-mq), > because no IO can be submitted to one hw queue if the hw queue hasn't online > CPUs. > > Fix this issue by forcing to use blk_mq. > > BTW, I have been used virtio-scsi(scsi_mq) for several years, and it has > been quite stable, so it shouldn't cause extra risk. I think that's ok now that we have I/O schedulers for blk-mq. Acked-by: Paolo Bonzini <pbonz...@redhat.com> Paolo > Cc: Arun Easi <arun.e...@cavium.com> > Cc: Omar Sandoval <osan...@fb.com>, > Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, > Cc: James Bottomley <james.bottom...@hansenpartnership.com>, > Cc: Christoph Hellwig <h...@lst.de>, > Cc: Don Brace <don.br...@microsemi.com> > Cc: Kashyap Desai <kashyap.de...@broadcom.com> > Cc: Peter Rivera <peter.riv...@broadcom.com> > Cc: Paolo Bonzini <pbonz...@redhat.com> > Cc: Mike Snitzer <snit...@redhat.com> > Reviewed-by: Hannes Reinecke <h...@suse.de> > Tested-by: Laurence Oberman <lober...@redhat.com> > Signed-off-by: Ming Lei <ming@redhat.com> > --- > drivers/scsi/virtio_scsi.c | 59 > +++--- > 1 file changed, 3 insertions(+), 56 deletions(-) > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 7c28e8d4955a..54e3a0f6844c 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -91,9 +91,6 @@ struct virtio_scsi_vq { > struct virtio_scsi_target_state { > seqcount_t tgt_seq; > > - /* Count of outstanding requests. */ > - atomic_t reqs; > - > /* Currently active virtqueue for requests sent to this target. */ > struct virtio_scsi_vq *req_vq; > }; > @@ -152,8 +149,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi > *vscsi, void *buf) > struct virtio_scsi_cmd *cmd = buf; > struct scsi_cmnd *sc = cmd->sc; > struct virtio_scsi_cmd_resp *resp = >resp.cmd; > - struct virtio_scsi_target_state *tgt = > - scsi_target(sc->device)->hostdata; > > dev_dbg(>device->sdev_gendev, > "cmd %p response %u status %#02x sense_len %u\n", > @@ -210,8 +205,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi > *vscsi, void *buf) > } > > sc->scsi_done(sc); > - > - atomic_dec(>reqs); > } > > static void virtscsi_vq_done(struct virtio_scsi *vscsi, > @@ -580,10 +573,7 @@ static int virtscsi_queuecommand_single(struct Scsi_Host > *sh, > struct scsi_cmnd *sc) > { > struct virtio_scsi *vscsi = shost_priv(sh); > - struct virtio_scsi_target_state *tgt = > - scsi_target(sc->device)->hostdata; > > - atomic_inc(>reqs); > return virtscsi_queuecommand(vscsi, >req_vqs[0], sc); > } > > @@ -596,55 +586,11 @@ static struct virtio_scsi_vq > *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi, > return >req_vqs[hwq]; > } > > -static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi, > -struct virtio_scsi_target_state > *tgt) > -{ > - struct virtio_scsi_vq *vq; > - unsigned long flags; > - u32 queue_num; > - > - local_irq_save(flags); > - if (atomic_inc_return(>reqs) > 1) { > - unsigned long seq; > - > - do { > - seq = read_seqcount_begin(>tgt_seq); > - vq = tgt->req_vq; > - } while (read_seqcount_retry(>tgt_seq, seq)); > - } else { > - /* no writes can be concurrent because of atomic_t */ > - write_seqcount_begin(>tgt_seq); > - > - /* keep previous req_vq if a reader just arrived */ > - if (unlikely(atomic_read(>reqs) > 1))
Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.
On 11/08/2017 19:23, Michael S. Tsirkin wrote: > On Fri, Aug 11, 2017 at 04:09:26PM +0200, Paolo Bonzini wrote: >> On 10/08/2017 23:41, Michael S. Tsirkin wrote: >>>>> Then we probably should fail probe if vq size is too small. >>>> What does this mean? >>> >>> We must prevent driver from submitting s/g lists > vq size to device. >> >> What is the rationale for the limit? > > So the host knows what it needs to support. > >> both virtio-blk and virtio-scsi transmit their own value for the >> maximum sg list size (max_seg in virtio-scsi, seg_max in virtio-blk). > > No other device has it, and it seemed like a good idea to > limit it generally at the time. > > we can fix the spec to relax the requirement for blk and scsi - > want to submit a proposal? Alternatively, add a generic field > for that. Yes, I can submit a proposal. blk and scsi are the ones that are most likely to have very long sg lists. When I was designing scsi I just copied that field from blk. :) Paolo
Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.
On 10/08/2017 23:41, Michael S. Tsirkin wrote: >>> Then we probably should fail probe if vq size is too small. >> What does this mean? > > We must prevent driver from submitting s/g lists > vq size to device. What is the rationale for the limit? It makes no sense if indirect descriptors are available, especially because... > Either tell linux to avoid s/g lists that are too long, or > simply fail request if this happens, or refuse to attach driver to device. > > Later option would look something like this within probe: > > for (i = VIRTIO_SCSI_VQ_BASE; i < num_vqs; i++) > if (vqs[i]->num < MAX_SG_USED_BY_LINUX) > goto err; > > > I don't know what's MAX_SG_USED_BY_LINUX though. > ... both virtio-blk and virtio-scsi transmit their own value for the maximum sg list size (max_seg in virtio-scsi, seg_max in virtio-blk). Paolo
Re: [PATCH v2 2/2] virtio: virtio_scsi: Set can_queue to the length of the virtqueue.
On 10/08/2017 18:56, Richard W.M. Jones wrote: > Since switching to blk-mq as the default in commit 5c279bd9e406 > ("scsi: default to scsi-mq"), virtio-scsi LUNs consume about 10x as > much kernel memory. > > qemu currently allocates a fixed 128 entry virtqueue. can_queue > currently is set to 1024. But with indirect descriptors, each command > in the queue takes 1 virtqueue entry, so the number of commands which > can be queued is equal to the length of the virtqueue. > > Note I intend to send a patch to qemu to allow the virtqueue size to > be configured from the qemu command line. > > Thanks Paolo Bonzini, Christoph Hellwig. > > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> > --- > drivers/scsi/virtio_scsi.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 9be211d68b15..7c28e8d4955a 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -818,7 +818,6 @@ static struct scsi_host_template > virtscsi_host_template_single = { > .eh_timed_out = virtscsi_eh_timed_out, > .slave_alloc = virtscsi_device_alloc, > > - .can_queue = 1024, > .dma_boundary = UINT_MAX, > .use_clustering = ENABLE_CLUSTERING, > .target_alloc = virtscsi_target_alloc, > @@ -839,7 +838,6 @@ static struct scsi_host_template > virtscsi_host_template_multi = { > .eh_timed_out = virtscsi_eh_timed_out, > .slave_alloc = virtscsi_device_alloc, > > - .can_queue = 1024, > .dma_boundary = UINT_MAX, > .use_clustering = ENABLE_CLUSTERING, > .target_alloc = virtscsi_target_alloc, > @@ -972,6 +970,8 @@ static int virtscsi_probe(struct virtio_device *vdev) > if (err) > goto virtscsi_init_failed; > > + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq); > + > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x; > Acked-by: Paolo Bonzini <pbonz...@redhat.com>
Re: [PATCH 2/2] virtio: virtio_scsi: Set can_queue to the length of the virtqueue.
On 10/08/2017 18:40, Richard W.M. Jones wrote: > Since switching to blk-mq as the default in commit 5c279bd9e406 > ("scsi: default to scsi-mq"), virtio-scsi LUNs consume about 10x as > much kernel memory. > > qemu currently allocates a fixed 128 entry virtqueue. can_queue > currently is set to 1024. But with indirect descriptors, each command > in the queue takes 1 virtqueue entry, so the number of commands which > can be queued is equal to the length of the virtqueue. > > Note I intend to send a patch to qemu to allow the virtqueue size to > be configured from the qemu command line. You can clear .can_queue from the templates. Otherwise looks good. Paolo > Thanks Paolo Bonzini, Christoph Hellwig. > > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> > --- > drivers/scsi/virtio_scsi.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 9be211d68b15..d6b4ff634c0d 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev) > if (err) > goto virtscsi_init_failed; > > + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq); > + > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x; >
Re: [PATCH 1/2] virtio: Reduce BUG if total_sg > virtqueue size to WARN.
On 10/08/2017 18:40, Richard W.M. Jones wrote: > If using indirect descriptors, you can make the total_sg as large as > you want. If not, BUG is too serious because the function later > returns -ENOSPC. > > Thanks Paolo Bonzini, Christoph Hellwig. > > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> > --- > drivers/virtio/virtio_ring.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 5e1b548828e6..27cbc1eab868 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq, > } > #endif > > - BUG_ON(total_sg > vq->vring.num); > BUG_ON(total_sg == 0); > > head = vq->free_head; > @@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq, >* buffers, then go indirect. FIXME: tune this threshold */ > if (vq->indirect && total_sg > 1 && vq->vq.num_free) > desc = alloc_indirect(_vq, total_sg, gfp); > - else > + else { > desc = NULL; > + WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect); So we get here only if vq->vq.num_free is zero. In that case, vq->indirect makes no difference for the appropriateness of the warning (that is, it should never be issued for indirect descriptors). > + } > > if (desc) { > /* Use a single buffer which doesn't continue */ > Reviewed-by: Paolo Bonzini <pbonz...@redhat.com> Paolo
Re: Increased memory usage with scsi-mq
On 10/08/2017 17:40, Richard W.M. Jones wrote: > OK this is looking a bit better now. > > With scsi-mq enabled: 175 disks > virtqueue_size=64: 318 disks * > virtqueue_size=16: 775 disks * > With scsi-mq disabled: 1755 disks > * = new results > > I also ran the whole libguestfs test suite with virtqueue_size=16 > (with no failures shown). As this tests many different disk I/O > operations, it gives me some confidence that things generally work. Yes, it looks good. I'll grab you on IRC to discuss the commit message. Paolo > Do you have any other comments about the patches? I'm not sure I know > enough to write an intelligent commit message for the kernel patch. > > Rich. > > --- kernel patch --- > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 9be211d68b15..d6b4ff634c0d 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev) > if (err) > goto virtscsi_init_failed; > > + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq); > + > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x; > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 5e1b548828e6..2d7509da9f39 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq, > } > #endif > > - BUG_ON(total_sg > vq->vring.num); > BUG_ON(total_sg == 0); > > head = vq->free_head; > @@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq, >* buffers, then go indirect. FIXME: tune this threshold */ > if (vq->indirect && total_sg > 1 && vq->vq.num_free) > desc = alloc_indirect(_vq, total_sg, gfp); > - else > + else { > desc = NULL; > + WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect); > +} > > if (desc) { > /* Use a single buffer which doesn't continue */ > > --- qemu patch --- > > diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c > index eb639442d1..aadd99aad1 100644 > --- a/hw/scsi/virtio-scsi.c > +++ b/hw/scsi/virtio-scsi.c > @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev, > s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE; > s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE; > > -s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl); > -s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt); > +s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl); > +s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt); > for (i = 0; i < s->conf.num_queues; i++) { > -s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd); > +s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd); > } > } > > @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState > *dev, Error **errp) > > static Property virtio_scsi_properties[] = { > DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, > 1), > +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, > parent_obj.conf.virtqueue_size, 128), > DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, > parent_obj.conf.max_sectors, >0x), > DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, > parent_obj.conf.cmd_per_lun, > diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h > index de6ae5a9f6..e30a92d3e7 100644 > --- a/include/hw/virtio/virtio-scsi.h > +++ b/include/hw/virtio/virtio-scsi.h > @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig; > > struct VirtIOSCSIConf { > uint32_t num_queues; > +uint32_t virtqueue_size; > uint32_t max_sectors; > uint32_t cmd_per_lun; > #ifdef CONFIG_VHOST_SCSI > > >
Re: Increased memory usage with scsi-mq
On 10/08/2017 16:16, Richard W.M. Jones wrote: > On Thu, Aug 10, 2017 at 02:53:58PM +0200, Paolo Bonzini wrote: >> can_queue and cmd_per_lun are different. can_queue should be set to the >> value of vq->vring.num where vq is the command virtqueue (the first one >> is okay if there's >1). >> >> If you want to change it, you'll have to do so in QEMU. > > Here are a couple more patches I came up with, the first to Linux, > the second to qemu. > > There are a few problems ... > > (1) Setting virtqueue_size to < 128 causes a BUG_ON failure in > virtio_ring.c:virtqueue_add at: > > BUG_ON(total_sg > vq->vring.num); That bug is bogus. You can change it to WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect); and move it in the "else" here: if (vq->indirect && total_sg > 1 && vq->vq.num_free) desc = alloc_indirect(_vq, total_sg, gfp); else desc = NULL; You just get a -ENOSPC after the WARN, so no need to crash the kernel! > (2) Can/should the ctrl, event and cmd queue sizes be set to the same > values? Or should ctrl & event be left at 128? It's okay if they're all the same. > (3) It seems like it might be a problem that virtqueue_size is not > passed through the virtio_scsi_conf struct (which is ABI between the > kernel and qemu AFAICT and so not easily modifiable). However I think > this is not a problem because virtqueue size is stored in the > Virtqueue Descriptor table, which is how the kernel gets the value. > Is that right? Yes. Paolo > Rich. > > > --- kernel patch --- > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 9be211d68b15..d6b4ff634c0d 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev) > if (err) > goto virtscsi_init_failed; > > + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq); > + > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x; > > > --- qemu patch --- > > diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c > index eb639442d1..aadd99aad1 100644 > --- a/hw/scsi/virtio-scsi.c > +++ b/hw/scsi/virtio-scsi.c > @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev, > s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE; > s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE; > > -s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl); > -s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt); > +s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl); > +s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt); > for (i = 0; i < s->conf.num_queues; i++) { > -s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd); > +s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd); > } > } > > @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState > *dev, Error **errp) > > static Property virtio_scsi_properties[] = { > DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, > 1), > +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, > parent_obj.conf.virtqueue_size, 128), > DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, > parent_obj.conf.max_sectors, >0x), > DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, > parent_obj.conf.cmd_per_lun, > diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h > index de6ae5a9f6..e30a92d3e7 100644 > --- a/include/hw/virtio/virtio-scsi.h > +++ b/include/hw/virtio/virtio-scsi.h > @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig; > > struct VirtIOSCSIConf { > uint32_t num_queues; > +uint32_t virtqueue_size; > uint32_t max_sectors; > uint32_t cmd_per_lun; > #ifdef CONFIG_VHOST_SCSI > >
Re: Increased memory usage with scsi-mq
On 10/08/2017 14:22, Richard W.M. Jones wrote: > On Wed, Aug 09, 2017 at 06:50:10PM +0200, Paolo Bonzini wrote: >> On 09/08/2017 18:01, Christoph Hellwig wrote: >>> On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote: >>>> can_queue should depend on the virtqueue size, which unfortunately can >>>> vary for each virtio-scsi device in theory. The virtqueue size is >>>> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and >>>> in QEMU it is the second argument to virtio_add_queue. >>> >>> Why is that unfortunate? We don't even have to set can_queue in >>> the host template, we can dynamically set it on per-host basis. >> >> Ah, cool, I thought allocations based on can_queue happened already in >> scsi_host_alloc, but they happen at scsi_add_host time. > > I think I've decoded all that information into the patch below. > > I tested it, and it appears to work: when I set cmd_per_lun on the > qemu command line, I see that the guest can add more disks: > > With scsi-mq enabled: 175 disks > cmd_per_lun not set:177 disks * > cmd_per_lun=16: 776 disks * > cmd_per_lun=4: 1160 disks * > With scsi-mq disabled: 1755 disks > * = new result > > From my point of view, this is a good result, but you should be warned > that I don't fully understand what's going on here and I may have made > obvious or not-so-obvious mistakes. can_queue and cmd_per_lun are different. can_queue should be set to the value of vq->vring.num where vq is the command virtqueue (the first one is okay if there's >1). If you want to change it, you'll have to do so in QEMU. Paolo > I tested the performance impact and it's not noticable in the > libguestfs case even with very small cmd_per_lun settings, but > libguestfs is largely serial and so this result won't be applicable to > guests in general. > > Also, should the guest kernel validate cmd_per_lun to make sure it's > not too small or large? And if so, what would the limits be? > > Rich. > > From e923e49846189b2f55f3f02b70a290d4be237ed5 Mon Sep 17 00:00:00 2001 > From: "Richard W.M. Jones" <rjo...@redhat.com> > Date: Thu, 10 Aug 2017 12:21:47 +0100 > Subject: [PATCH] scsi: virtio_scsi: Set can_queue based on cmd_per_lun passed > by hypervisor. > > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> > --- > drivers/scsi/virtio_scsi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 9be211d68b15..b22591e9b16b 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -973,7 +973,7 @@ static int virtscsi_probe(struct virtio_device *vdev) > goto virtscsi_init_failed; > > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > - shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > + shost->cmd_per_lun = shost->can_queue = cmd_per_lun; > shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0x; > > /* LUNs > 256 are reported with format 1, so they go in the range >
Re: Increased memory usage with scsi-mq
On 09/08/2017 18:01, Christoph Hellwig wrote: > On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote: >> can_queue should depend on the virtqueue size, which unfortunately can >> vary for each virtio-scsi device in theory. The virtqueue size is >> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and >> in QEMU it is the second argument to virtio_add_queue. > > Why is that unfortunate? We don't even have to set can_queue in > the host template, we can dynamically set it on per-host basis. Ah, cool, I thought allocations based on can_queue happened already in scsi_host_alloc, but they happen at scsi_add_host time. Paolo
Re: Increased memory usage with scsi-mq
On 07/08/2017 14:27, Richard W.M. Jones wrote: > Also I noticed this code in virtio_scsi.c: > > cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; > shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); > > but setting cmd_per_lun (as a qemu virtio-scsi-pci parameter) didn't > seem to make any difference to memory usage. That's because memory depends on shost->can_queue, not shost->cmd_per_lun (see scsi_mq_setup_tags). can_queue should depend on the virtqueue size, which unfortunately can vary for each virtio-scsi device in theory. The virtqueue size is retrieved by drivers/virtio prior to calling vring_create_virtqueue, and in QEMU it is the second argument to virtio_add_queue. For virtio-scsi this is VIRTIO_SCSI_VQ_SIZE, which is 128, so changing can_queue to 128 should be a no brainer. Thanks, Paolo
Re: Increased memory usage with scsi-mq
On 05/08/2017 17:51, Richard W.M. Jones wrote: > On Sat, Aug 05, 2017 at 03:39:54PM +0200, Christoph Hellwig wrote: >> For now can you apply this testing patch to the guest kernel? >> >> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c >> index 9be211d68b15..0cbe2c882e1c 100644 >> --- a/drivers/scsi/virtio_scsi.c >> +++ b/drivers/scsi/virtio_scsi.c >> @@ -818,7 +818,7 @@ static struct scsi_host_template >> virtscsi_host_template_single = { >> .eh_timed_out = virtscsi_eh_timed_out, >> .slave_alloc = virtscsi_device_alloc, >> >> -.can_queue = 1024, >> +.can_queue = 64, >> .dma_boundary = UINT_MAX, >> .use_clustering = ENABLE_CLUSTERING, >> .target_alloc = virtscsi_target_alloc, >> @@ -839,7 +839,7 @@ static struct scsi_host_template >> virtscsi_host_template_multi = { >> .eh_timed_out = virtscsi_eh_timed_out, >> .slave_alloc = virtscsi_device_alloc, >> >> -.can_queue = 1024, >> +.can_queue = 64, >> .dma_boundary = UINT_MAX, >> .use_clustering = ENABLE_CLUSTERING, >> .target_alloc = virtscsi_target_alloc, >> @@ -983,7 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev) >> shost->max_id = num_targets; >> shost->max_channel = 0; >> shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE; >> -shost->nr_hw_queues = num_queues; >> +shost->nr_hw_queues = 1; >> >> #ifdef CONFIG_BLK_DEV_INTEGRITY >> if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) { > > Yes, that's an improvement, although it's still a little way off the > density possible the old way: > > With scsi-mq enabled: 175 disks > * With this patch:319 disks * > With scsi-mq disabled: 1755 disks > > Also only the first two hunks are necessary. The kernel behaves > exactly the same way with or without the third hunk (ie. num_queues > must already be 1). > > Can I infer from this that qemu needs a way to specify the can_queue > setting to the virtio-scsi driver in the guest kernel? You could also add a module parameter to the driver, and set it to 64 on the kernel command line (there is an example in drivers/scsi/vmw_pvscsi.c of how to do it). Paolo
[PATCH] virtio_scsi: always read VPD pages for multiqueue too
Multi-queue virtio-scsi uses a different scsi_host_template struct. Add the .device_alloc field there, too. Fixes: 25d1d50e23275e141e3a3fe06c25a99f4c4bf4e0 Cc: sta...@vger.kernel.org Cc: David Gibson <da...@gibson.dropbear.id.au> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- drivers/scsi/virtio_scsi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index f8dbfeee6c63..ad1e7f1aba4c 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -826,6 +826,7 @@ static int virtscsi_map_queues(struct Scsi_Host *shost) .change_queue_depth = virtscsi_change_queue_depth, .eh_abort_handler = virtscsi_abort, .eh_device_reset_handler = virtscsi_device_reset, + .slave_alloc = virtscsi_device_alloc, .can_queue = 1024, .dma_boundary = UINT_MAX, -- 1.8.3.1
[PATCH] virtio_scsi: let host do exception handling
virtio_scsi tries to do exception handling after the default 30 seconds timeout expires. However, it's better to let the host control the timeout, otherwise with a heavy I/O load it is likely that an abort will also timeout. This leads to fatal errors like filesystems going offline. Disable the 'sd' timeout and allow the host to do exception handling, following the precedent of the storvsc driver. Hannes has a proposal to introduce timeouts in virtio, but this provides an immediate solution for stable kernels too. Reported-by: Douglas Miller <dougm...@linux.vnet.ibm.com> Cc: "James E.J. Bottomley" <j...@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.peter...@oracle.com> Cc: Hannes Reinecke <h...@suse.de> Cc: linux-scsi@vger.kernel.org Cc: sta...@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- drivers/scsi/virtio_scsi.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index f8dbfeee6c63..55d344ea6869 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -796,6 +796,16 @@ static int virtscsi_map_queues(struct Scsi_Host *shost) return blk_mq_virtio_map_queues(>tag_set, vscsi->vdev, 2); } +/* + * The host guarantees to respond to each command, although I/O latencies might + * be higher than on bare meta. Reset the timer unconditionally to give the + * host a chance to perform EH. + */ +static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd) +{ + return BLK_EH_RESET_TIMER; +} + static struct scsi_host_template virtscsi_host_template_single = { .module = THIS_MODULE, .name = "Virtio SCSI HBA", @@ -806,6 +816,7 @@ static struct scsi_host_template virtscsi_host_template_single = { .change_queue_depth = virtscsi_change_queue_depth, .eh_abort_handler = virtscsi_abort, .eh_device_reset_handler = virtscsi_device_reset, + .eh_timed_out = virtscsi_eh_timed_out, .slave_alloc = virtscsi_device_alloc, .can_queue = 1024, @@ -826,6 +837,7 @@ static struct scsi_host_template virtscsi_host_template_multi = { .change_queue_depth = virtscsi_change_queue_depth, .eh_abort_handler = virtscsi_abort, .eh_device_reset_handler = virtscsi_device_reset, + .eh_timed_out = virtscsi_eh_timed_out, .can_queue = 1024, .dma_boundary = UINT_MAX, -- 2.13.0
Re: [PATCH] virtio_scsi: Always try to read VPD pages
On 13/04/2017 15:39, Stefan Hajnoczi wrote: > On Thu, Apr 13, 2017 at 12:13:00PM +1000, David Gibson wrote: >> @@ -705,6 +706,28 @@ static int virtscsi_device_reset(struct scsi_cmnd *sc) >> return virtscsi_tmf(vscsi, cmd); >> } >> >> +static int virtscsi_device_alloc(struct scsi_device *sdevice) >> +{ >> +/* >> + * Passed through SCSI targets (e.g. with qemu's 'scsi-block') >> + * may have transfer limits which come from the host SCSI >> + * controller something on the host side other than the target > > s/controller something/controller or something/ ? Acked-by: Paolo Bonzini <pbonz...@redhat.com> with this change. Paolo
Re: [PATCH 26/27] scsi: sd: Separate zeroout and discard command choices
On 05/04/2017 19:21, Christoph Hellwig wrote: > +static ssize_t > +zeroing_mode_store(struct device *dev, struct device_attribute *attr, > +const char *buf, size_t count) > +{ > + struct scsi_disk *sdkp = to_scsi_disk(dev); > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EACCES; > + > + > + if (!strncmp(buf, zeroing_mode[SD_ZERO_WRITE], 20)) > + sdkp->zeroing_mode = SD_ZERO_WRITE; > + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS], 20)) > + sdkp->zeroing_mode = SD_ZERO_WS; > + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS16_UNMAP], 20)) > + sdkp->zeroing_mode = SD_ZERO_WS16_UNMAP; > + else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS10_UNMAP], 20)) > + sdkp->zeroing_mode = SD_ZERO_WS10_UNMAP; Should this be conditional on lbprz, lbpws, lbpws10 and max_ws_blocks? Thanks, Paolo
Re: [PATCH 03/23] sd: implement REQ_OP_WRITE_ZEROES
On 29/03/2017 18:28, Bart Van Assche wrote: > On Wed, 2017-03-29 at 16:51 +0200, Paolo Bonzini wrote: >> On 28/03/2017 20:50, Bart Van Assche wrote: >>> This means that just like the start and end of a discard must be aligned on >>> a >>> discard_granularity boundary, WRITE SAME commands with the UNMAP bit set >>> must >>> also respect that granularity. I think this means that either >>> __blkdev_issue_zeroout() has to be modified such that it rejects unaligned >>> REQ_OP_WRITE_ZEROES operations or that blk_bio_write_same_split() has to be >>> modified such that it generates REQ_OP_WRITEs for the unaligned start and >>> tail. >> >> I don't think this is the case. > > Hello Paolo, > > Can you cite the section(s) from the SCSI specs that support your view? I > reread the "5.49 WRITE SAME (10) command" and "4.7.3.4.4 WRITE SAME command > and unmap operations" sections but I have not found any explicit statement > that specifies the behavior for unaligned WRITE SAME commands with the UNMAP > bit set. It seems to me like the OPTIMAL UNMAP GRANULARITY parameter was > overlooked when both sections were written. Should we ask the T10 committee > for a clarification? >From 4.7.3.4.4: -- If unmap operations are requested in a WRITE SAME command, then for each specified LBA: if the Data-Out Buffer of the WRITE SAME command is the same as the logical block data returned by a read operation from that LBA while in the unmapped state (see 4.7.4.5), then: 1) the device server performs the actions described in table 6; and 2) if an unmap operation is not performed in step 1), then the device server shall perform the specified write operation to that LBA; -- and from the description of WRITE SAME (10): "subsequent read operations behave as if the device server wrote the single block of user data received from the Data-Out Buffer to each logical block without modification" (I have a slightly older copy though, it's 5.45 here). It's pretty unambiguous that if the device cannot unmap (including the case where the request is misaligned with respect to the granularity) it does a write. > Another question is, if the specification of WRITE SAME + UNMAP would be > made unambiguous in the SBC document, whether or not we should take the risk > to trigger behavior that is not what we expect by sending unaligned WRITE > SAME + UNMAP commands to SCSI devices? Yes, I think we should. Paolo
Re: RFC: always use REQ_OP_WRITE_ZEROES for zeroing offload
On 23/03/2017 23:53, Lars Ellenberg wrote: > Thin does not claim to zero data on discard. which is ok, and correct, > because it only punches holes on full chunks (or whatever you call > them), and leaves the rest in the mapping tree as is. > > And that behaviour would prevent DRBD from exposing discards if > configured on top of thin. (see above) > > But thin *could* easily guarantee zeroing, by simply punching holes > where it can, and zeroing out the not fully-aligned partial start and > end of the range. That's the difference between REQ_OP_DISCARD (only punches holes on full chunks) and REQ_OP_WRITE_ZEROES with the REQ_UNMAP flag (punches holes + zeroes incomplete chunks). dm-thinp's REQ_OP_DISCARD should not do anything for unaligned parts. Instead, layers above should use REQ_OP_WRITE_ZEROES (with or without REQ_UNMAP, as required) if they need zeroes. dm-thinp would have to split off the partial chunks, and zero them in the lower-level device with REQ_OP_WRITE_ZEROES. Paolo
Re: [PATCH 12/23] sd: handle REQ_UNMAP
On 28/03/2017 18:48, Bart Van Assche wrote: >> +if (rq->cmd_flags & REQ_UNMAP) { >> +switch (sdkp->provisioning_mode) { >> +case SD_LBP_WS16: >> +return sd_setup_write_same16_cmnd(cmd, true); >> +case SD_LBP_WS10: >> +return sd_setup_write_same10_cmnd(cmd, true); >> +} >> +} >> + >> if (sdp->no_write_same) >> return BLKPREP_INVALID; >> if (sdkp->ws16 || sector > 0x || nr_sectors > 0x) > Users can change the provisioning mode from user space from SD_LBP_WS16 into > SD_LBP_WS10 so I'm not sure it's safe to skip the (sdkp->ws16 || sector > > 0x || nr_sectors > 0x) check if REQ_UNMAP is set. Yeah, if REQ_UNMAP is set you should probably check sdkp->provisioning_mode instead of sdkp->ws16, but apart from this it should still go through the checks below. Plus, if the provisioning mode is not ws10 or ws16, should sd_setup_write_zeroes_cmnd: 1) do a WRITE SAME without UNMAP (what Christoph's code does) 2) return BLKPREP_INVALID 3) ignore provisioning mode and do a WRITE SAME with UNMAP 4) do a WRITE SAME without UNMAP for SD_LBP_{ZERO,FULL,DISABLE}, do a WRITE SAME with UNMAP for SD_LBP_{WS10,WS16,UNMAP}. I'm in favor of (4). The distinction between SD_LBP_UNMAP, SD_LBP_WS10 and SD_LBP_WS16 is as problematic as discard_zeroes_data in my opinion. Thanks, Paolo
Re: [PATCH 23/23] block: remove the discard_zeroes_data flag
On 28/03/2017 19:00, Bart Van Assche wrote: > On Thu, 2017-03-23 at 10:33 -0400, Christoph Hellwig wrote: >> Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can >> kill this hack. >> >> [ ... ] >> >> diff --git a/Documentation/ABI/testing/sysfs-block >> b/Documentation/ABI/testing/sysfs-block >> index 2da04ce6aeef..dea212db9df3 100644 >> --- a/Documentation/ABI/testing/sysfs-block >> +++ b/Documentation/ABI/testing/sysfs-block >> @@ -213,14 +213,8 @@ What: >> /sys/block//queue/discard_zeroes_data >> Date: May 2011 >> Contact:Martin K. Petersen>> Description: >> -Devices that support discard functionality may return >> -stale or random data when a previously discarded block >> -is read back. This can cause problems if the filesystem >> -expects discarded blocks to be explicitly cleared. If a >> -device reports that it deterministically returns zeroes >> -when a discarded area is read the discard_zeroes_data >> -parameter will be set to one. Otherwise it will be 0 and >> -the result of reading a discarded area is undefined. >> +Will always return 0. Don't rely on any specific behavior >> +for discards, and don't read this file. >> >> What: /sys/block//queue/write_same_max_bytes >> Date: January 2012 >> >> [ ... ] >> >> --- a/block/blk-sysfs.c >> +++ b/block/blk-sysfs.c >> @@ -208,7 +208,7 @@ static ssize_t queue_discard_max_store(struct >> request_queue *q, >> >> static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char >> *page) >> { >> -return queue_var_show(queue_discard_zeroes_data(q), page); >> +return 0; >> } > > Hello Christoph, > > It seems to me like the documentation in Documentation/ABI/testing/sysfs-block > and the above code are not in sync. I think the above code will cause reading > from the discard_zeroes_data attribute to return an empty string ("") instead > of "0\n". > > BTW, my personal preference is to remove this attribute entirely because > keeping > it will cause confusion, no matter how well we document the behavior of this > attribute. If you remove it, you should probably remove the BLKDISCARDZEROES ioctl too. That said, the issue with discard_zeroes_data is that it is badly defined; it was defined as "if I unmap X, will it read as zeroes?" but this is not how the SCSI standard defines e.g. the UNMAP command with LBPRZ=1. But knowing something like LBPRZ ("if something is unmapped, will it read as zeroes?") _would_ actually be useful for userspace. This will be especially true once sd maps lseek(SEEK_HOLE/SEEK_DATA) to the SCSI GET LBA STATUS command, or once dm-thin supports them. Secondarily, if the former returns 1, userspace is also interested in knowing "can REQ_OP_WRITE_ZEROES+REQ_UNMAP ever unmap anything?", i.e. whether BLKDEV_ZERO_NOFALLBACK will ever return anything but -EOPNOTSUPP. For SCSI, this should intuitively mean whether LBPWS or LBPWS10 are set, but the details depend on how the sd driver implements REQ_OP_WRITE_ZEROES with REQ_UNMAP. Paolo
Re: [PATCH 03/23] sd: implement REQ_OP_WRITE_ZEROES
On 28/03/2017 20:50, Bart Van Assche wrote: > > This means that just like the start and end of a discard must be aligned on a > discard_granularity boundary, WRITE SAME commands with the UNMAP bit set must > also respect that granularity. I think this means that either > __blkdev_issue_zeroout() has to be modified such that it rejects unaligned > REQ_OP_WRITE_ZEROES operations or that blk_bio_write_same_split() has to be > modified such that it generates REQ_OP_WRITEs for the unaligned start and > tail. I don't think this is the case. Rather, Linux should try to align the WRITE SAME commands to the optimal unmap granularity if the zeroed area requires performing more than one WRITE SAME command (i.e. > maximum write same length or too large to fit in the CDB). However, even in that case it can use WRITE SAME with UNMAP for the unaligned start and tail. Unlike the UNMAP command, the SCSI standard does guarantee that zeroes are written in the unaligned parts. Paolo
Re: [PATCH 2/2] virtio_scsi: Implement fc_host
On 16/01/2017 18:26, Fam Zheng wrote: >> Is the endianness correct for big-endian host here? > > I think so. The fc_host sysfs uses u64 to represent port_name and node_name, > this patch does the same, so using virtio_* helpers for these fields should > handle the endianness correctly. I was suspicious about it because they are defined as "u8 x[8]" in the virtio_scsi_config struct. So you would need to read with virtio_cread_bytes and pass the result to wwn_to_u64. For example, if you have 0x500123456789abcd this would be 0x50 0x01 0x23 0x45 0x67 0x89 0xab 0cd in virtio_scsi_config, and then virtio_cread64 would read it as a little-endian u64, 0xcdab896745230150. Maybe your QEMU patch is also writing things as little-endian 64-bit integers, rather than 8-element arrays of bytes? Paolo > Maybe we should use u64 in struct virtio_scsi_config as well? -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] virtio-scsi: Implement FC_HOST feature
> How it this supposed to work? > You do export the WWPN/WWNN of the associated host to the guest (nb: > will get interesting for non NPIV setups ...), but virtio scsi will > still do a LUN remapping. > IE the LUNs you see on the host will be different from the LUNs > presented to the guest. This is taken care of in the host by presenting to the host all LUNs from a host's NPIV vHBA. (Libvirt probably would be the one taking care of this, because QEMU may not have enough permissions). > Plus you don't _actually_ expose the FC host, but rather the WWPN of the > host presenting the LUN. > So how do you handle LUNs from different FC hosts on the guest? I'm not sure I understand. Neither I nor Fam know this stuff very well, but we are trying to do the same as Hyper-V (and other proprietary hypervisors too). > Overall, I'm not overly happy with this approach. > You already added WWPN ids to the virtio transport, so why didn't you > update the LUN field, too, to avoid this ominous LUN remapping? Is this your old idea of adding a separate target field to commands, in order to support 64-bit LUNs? That is separate, and most FC drivers only default to 16-bit LUNs anyway. > And we really should make sure to have a single FC host in the guest > presenting all LUNs. Yes, of course. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] virtio_scsi: Implement fc_host
On 16/01/2017 17:04, Fam Zheng wrote: > + node_name = virtio_cread64(vdev, > + offsetof(struct virtio_scsi_config, primary_wwnn)); > + port_name = virtio_cread64(vdev, > + offsetof(struct virtio_scsi_config, primary_wwpn)); > + } else { > + node_name = virtio_cread64(vdev, > + offsetof(struct virtio_scsi_config, secondary_wwnn)); > + port_name = virtio_cread64(vdev, > + offsetof(struct virtio_scsi_config, secondary_wwpn)); Is the endianness correct for big-endian host here? Thanks, Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ibmvscsi: add write memory barrier to CRQ processing
On 08/12/2016 00:31, Tyrel Datwyler wrote: > The first byte of each CRQ entry is used to indicate whether an entry is > a valid response or free for the VIOS to use. After processing a > response the driver sets the valid byte to zero to indicate the entry is > now free to be reused. Add a memory barrier after this write to ensure > no other stores are reordered when updating the valid byte. > > Signed-off-by: Tyrel Datwyler> --- > drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c > b/drivers/scsi/ibmvscsi/ibmvscsi.c > index d9534ee..2f5b07e 100644 > --- a/drivers/scsi/ibmvscsi/ibmvscsi.c > +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c > @@ -232,6 +232,7 @@ static void ibmvscsi_task(void *data) > while ((crq = crq_queue_next_crq(>queue)) != NULL) { > ibmvscsi_handle_crq(crq, hostdata); > crq->valid = VIOSRP_CRQ_FREE; > + wmb(); > } > > vio_enable_interrupts(vdev); > @@ -240,6 +241,7 @@ static void ibmvscsi_task(void *data) > vio_disable_interrupts(vdev); > ibmvscsi_handle_crq(crq, hostdata); > crq->valid = VIOSRP_CRQ_FREE; > + wmb(); Should this driver use virt_wmb instead? Paolo > } else { > done = 1; > } > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] mm: remove write/force parameters from __get_user_pages_unlocked()
= 0; > unsigned long max_pages_per_loop = PVM_MAX_KMALLOC_PAGES > / sizeof(struct pages *); > + unsigned int flags = FOLL_REMOTE; > > /* Work out address and page range required */ > if (len == 0) > return 0; > nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1; > > + if (vm_write) > + flags |= FOLL_WRITE; > + > while (!rc && nr_pages && iov_iter_count(iter)) { > int pages = min(nr_pages, max_pages_per_loop); > size_t bytes; > @@ -104,8 +108,7 @@ static int process_vm_rw_single_vec(unsigned long addr, >* current/current->mm >*/ > pages = __get_user_pages_unlocked(task, mm, pa, pages, > - vm_write, 0, process_pages, > - FOLL_REMOTE); > + process_pages, flags); > if (pages <= 0) > return -EFAULT; > > diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c > index db96688..8035cc1 100644 > --- a/virt/kvm/async_pf.c > +++ b/virt/kvm/async_pf.c > @@ -84,7 +84,8 @@ static void async_pf_execute(struct work_struct *work) >* mm and might be done in another context, so we must >* use FOLL_REMOTE. >*/ > - __get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL, FOLL_REMOTE); > + __get_user_pages_unlocked(NULL, mm, addr, 1, NULL, > + FOLL_WRITE | FOLL_REMOTE); > > kvm_async_page_present_sync(vcpu, apf); > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 81dfc73..28510e7 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -1416,10 +1416,15 @@ static int hva_to_pfn_slow(unsigned long addr, bool > *async, bool write_fault, > down_read(>mm->mmap_sem); > npages = get_user_page_nowait(addr, write_fault, page); > up_read(>mm->mmap_sem); > - } else > + } else { > + unsigned int flags = FOLL_TOUCH | FOLL_HWPOISON; > + > + if (write_fault) > + flags |= FOLL_WRITE; > + > npages = __get_user_pages_unlocked(current, current->mm, addr, > 1, > -write_fault, 0, page, > -FOLL_TOUCH|FOLL_HWPOISON); > +page, flags); > + } > if (npages != 1) > return npages; > > Acked-by: Paolo Bonzini <pbonz...@redhat.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Remove the unnecessary semicolons
On 19/08/2016 04:03, Zhou Jie wrote: > At the end of funcions, semicolons are unnecessary. So drop them. > > Signed-off-by: Chao Fan <fanc.f...@cn.fujitsu.com> > --- > drivers/scsi/virtio_scsi.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c > index 7dbbb29..9632a0c 100644 > --- a/drivers/scsi/virtio_scsi.c > +++ b/drivers/scsi/virtio_scsi.c > @@ -241,7 +241,7 @@ static void virtscsi_req_done(struct virtqueue *vq) > struct virtio_scsi_vq *req_vq = >req_vqs[index]; > > virtscsi_vq_done(vscsi, req_vq, virtscsi_complete_cmd); > -}; > +} > > static void virtscsi_poll_requests(struct virtio_scsi *vscsi) > { > @@ -267,7 +267,7 @@ static void virtscsi_ctrl_done(struct virtqueue *vq) > struct virtio_scsi *vscsi = shost_priv(sh); > > virtscsi_vq_done(vscsi, >ctrl_vq, virtscsi_complete_free); > -}; > +} > > static void virtscsi_handle_event(struct work_struct *work); > > @@ -413,7 +413,7 @@ static void virtscsi_event_done(struct virtqueue *vq) > struct virtio_scsi *vscsi = shost_priv(sh); > > virtscsi_vq_done(vscsi, >event_vq, virtscsi_complete_event); > -}; > +} > > /** > * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue > Acked-by: Paolo Bonzini <pbonz...@redhat.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist
On 06/06/2016 17:47, John Snow wrote: > > > Various downstreams may have backported the VPD fix to older versions, > > > we need to be careful not to block those, too ... so targeting the core > > > behavior seems like the more strictly correct, easily maintainable > > > solution. > > > > I think this is not practical. I'm okay with the big hammer if an > > algorithmic fix is not feasible; but otherwise it does seem a better > > idea than blacklisting based on inquiry data... > > You think the more practical solution is a SCSI driver that can hang > because of an incorrect/missing response and to maintain a carefully > curated blacklist to work around this behavior? The best solution would be an algorithmic fix, perhaps predicated by some kind of quirk bit. A carefully curated blacklist is impossible because you cannot account for a zillion downstreams, most of which probably don't change the inquire vendor/product data; version numbers are awful. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist
On 06/06/2016 17:41, John Snow wrote: > On 06/06/2016 11:05 AM, Paolo Bonzini wrote: >> For ATAPI, you have to blacklist all versions up to 2.2 inclusive. >> >> This gives: >> >> - QEMU / QEMU CD-ROM / 0.8.(this is IDE and SCSI) >> - QEMU / QEMU CD-ROM / 0.9.(this is IDE and SCSI) >> - QEMU / QEMU CD-ROM / 0.10(this is SCSI only) >> - QEMU / QEMU CD-ROM / 0.11(this is SCSI only) >> - QEMU / QEMU DVD-ROM / 0.8. (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.9. (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.10 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.11 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.12 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.13 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.14 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 0.15 (this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.0(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.1(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.2(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.3(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.4(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.5(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.6(this is IDE only) >> - QEMU / QEMU DVD-ROM / 1.7(this is IDE only) >> - QEMU / QEMU DVD-ROM / 2.0(this is IDE only) >> - QEMU / QEMU DVD-ROM / 2.1(this is IDE only) >> - QEMU / QEMU DVD-ROM / 2.2(this is IDE only) >> > > If this bug is caused by a missing VPD response, Paolo's version history > here is correct for upstream versions. > > Various downstreams may have backported the VPD fix to older versions, > we need to be careful not to block those, too ... so targeting the core > behavior seems like the more strictly correct, easily maintainable solution. I think this is not practical. I'm okay with the big hammer if an algorithmic fix is not feasible; but otherwise it does seem a better idea than blacklisting based on inquiry data... Thanks, Paolo > Why not just dynamically blacklist devices that fail to respond to VPD > inquiries? -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist
On 06/06/2016 16:22, Hannes Reinecke wrote: > So either we dig into what went wrong with qemu 0.8, or we figure out > from which qemu version things start to behave nicely, and blacklist > earlier versions. > > > > Either way, this patch is wrong. > > > > If we can identify which versions work, we can update it. Otherwise > > I think we have to be conservative. > > So far we just had this single report where the upstream kernel didn't > work correctly with a (really old) version of qemu. > Hardly justifying blacklisting qemu CD-ROM in general. To further complicate the matter there are two QEMU MMC devices: 1) ATAPI - vendor "QEMU" / product name "QEMU CD-ROM" before QEMU 0.10.0 - vendor "QEMU" / product name "QEMU DVD-ROM" since QEMU 0.10.0 2) native SCSI - vendor "QEMU" / product name "QEMU CD-ROM" VPD in the SCSI CD-ROM probably has always worked, but I would blacklist up to 0.11 inclusive just to be safe. Those versions are dead anyway. VPD in the ATAPI CD-ROM is newer, and that's where the bug was reported on: > [4.439488] ata2.00: ATAPI: QEMU CD-ROM, 0.8.2, max UDMA/100 > [4.443649] ata2.00: configured for MWDMA2 > [4.450267] scsi 1:0:0:0: CD-ROMQEMU QEMU CD-ROM 0.8. > PQ: 0 ANSI: 5 > [4.464317] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 > frozen > [4.464319] ata2.00: BMDMA stat 0x5 > [4.464339] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma > 16640 in > [4.464339] Inquiry 12 01 00 00 ff 00res > 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) > [4.464341] ata2.00: status: { DRDY DRQ } For ATAPI, you have to blacklist all versions up to 2.2 inclusive. This gives: - QEMU / QEMU CD-ROM / 0.8.(this is IDE and SCSI) - QEMU / QEMU CD-ROM / 0.9.(this is IDE and SCSI) - QEMU / QEMU CD-ROM / 0.10(this is SCSI only) - QEMU / QEMU CD-ROM / 0.11(this is SCSI only) - QEMU / QEMU DVD-ROM / 0.8. (this is IDE only) - QEMU / QEMU DVD-ROM / 0.9. (this is IDE only) - QEMU / QEMU DVD-ROM / 0.10 (this is IDE only) - QEMU / QEMU DVD-ROM / 0.11 (this is IDE only) - QEMU / QEMU DVD-ROM / 0.12 (this is IDE only) - QEMU / QEMU DVD-ROM / 0.13 (this is IDE only) - QEMU / QEMU DVD-ROM / 0.14 (this is IDE only) - QEMU / QEMU DVD-ROM / 0.15 (this is IDE only) - QEMU / QEMU DVD-ROM / 1.0(this is IDE only) - QEMU / QEMU DVD-ROM / 1.1(this is IDE only) - QEMU / QEMU DVD-ROM / 1.2(this is IDE only) - QEMU / QEMU DVD-ROM / 1.3(this is IDE only) - QEMU / QEMU DVD-ROM / 1.4(this is IDE only) - QEMU / QEMU DVD-ROM / 1.5(this is IDE only) - QEMU / QEMU DVD-ROM / 1.6(this is IDE only) - QEMU / QEMU DVD-ROM / 1.7(this is IDE only) - QEMU / QEMU DVD-ROM / 2.0(this is IDE only) - QEMU / QEMU DVD-ROM / 2.1(this is IDE only) - QEMU / QEMU DVD-ROM / 2.2(this is IDE only) Thanks, Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Application error handling with write-back caching
On 10/05/2016 19:31, James Bottomley wrote: > > What about a SPACE ALLOCATION FAILED error or a similar error that > > can be fixed by administrator actions (or just by a concurrent > > process doing an UNMAP)? Would a subsequent cache flush cause data > > loss? > > You're now asking about how these are handled? It's not a SCSI > problem. I believe if you look at the various layers, RAID would still > treat it as fatal and fail the drive and so would most filesystems. > The AEN warnings for TP are reported, but the admin has to sort it out > before they become a fatal error. Thanks, fatal errors are fine I guess. We were worried that the next SYNCHRONIZE CACHE would succeed and throw away the writes because it has already "performed a write medium operation". POSIX fsync is pretty underspecified in this respect too; gluster has been throwing away those writes for a long time! It stopped now because we explained the issue to them, but it's pointless if the next layer below does the same---hence Stefan's question. (In our case the next layer is not the page cache, because we generally use O_DIRECT. Evicting dirty pages from the page cache would be okay if the process(es) that wrote them are SIGKILLed, but in general it would be a problem too). Thanks, Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Application error handling with write-back caching
On 10/05/2016 16:16, James Bottomley wrote: > > If "is performed" just means "completes", maybe with an error, the > > application would have to resubmit write requests and then try to > > flush the write cache again. > > > > I'm not aware of applications that keep acknowledged write data > > around until the cache flush completion in order to retry writes. > > I think you may be misunderstanding the nature of the returned error. > It will be permanent and fatal and usually signal that the device has > a failed sector that can't be remapped and so the device itself has for > most purposes failed. The only recovery is if you happen to have RAID, > in which case the RAID layer will mostly take care of it. What about a SPACE ALLOCATION FAILED error or a similar error that can be fixed by administrator actions (or just by a concurrent process doing an UNMAP)? Would a subsequent cache flush cause data loss? Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/4] scsi: cleanup ioctl headers and provide UAPI versions
On 25/09/2015 11:27, Paolo Bonzini wrote: > This is v3 of the series to provide an "official" sg.h header (and > scsi_ioctl.h too, though it's basically obsolete) together with the other > userspace API definitions. The change from v2 to v3 is that defaults > for sg.c are not exported in include/uapi/linux/sg.c. > > Paolo > > 2.5.0 > What happened to these patches?... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 02/11/2015 16:05, Mike Snitzer wrote: > > In any case, if we don't start path activation we should return > > ENOTCONN, not ENOTTY. > > Currently, if we don't start path activation we're returning EIO. > ENOTCONN is used for when we do start path activation (and ENOTCONN is > the means for DM core to retry) > > We _could_ change the ENOTCONN to be EAGAIN and EIO to ENOTCONN... This makes sense... though of course testing the impact of this on userspace is going to be hard. :( Chances are that userspace is not expecting EAGAIN either. Even if they did, how would someone know that they can now retry the ioctl after getting EAGAIN? Should they just do it in a loop? Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 02/11/2015 14:56, Hannes Reinecke wrote: > But then the real question remains: > > What is the 'correct' behaviour for ioctls when no path retry > is active (or when no paths are present)? > > Should we start path activation? > If so, should we wait for path activation to finish, risking udev > killing the worker for that event (udev has a built-in timeout of > 120 seconds, which we might easily exceed when we need to activate > paths for large installations or slow path activation ... just > thinking of NetApp takeover/giveback cycle)? > If we're not waiting for path activation, where would be the point > in starting it, seeing that we're not actually interested in the result? > And if we shouldn't start a path activation, what is the point of > having code for it in the first place? That's a fair question, and it depends on what said udev worker actually does. In any case, if we don't start path activation we should return ENOTCONN, not ENOTTY. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 02/11/2015 08:28, Hannes Reinecke wrote: > > With the original code we would need to wait for path activation > before we would be able to figure out if the ioctl is valid. > However, the callback to verify the ioctl is sd_ioctl(), which as > a first step calls scsi_verify_ioctl(). > So my reasoning was that we can as well call scsi_verify_ioctl() > early, and allow it to filter out known invalid ioctls. > Then we wouldn't need to wait for path activation and return > immediately. That in principle makes sense. Unfortunately, before path activation you can only find out if a ioctl is valid. You cannot find out which ioctls are _in_valid, because path activation sets the bdev and that might make all ioctls valid. > Incidentally, in the 'r == -ENOTCONN' case, we're waiting > for path activation, but then just bail out with -ENOTCONN. > As we're not resetting -ENOTCONN, where's the point in activate the > paths here? > Shouldn't we retry to figure out if -ENOTCONN is still true? That makes sense too. You've done the work, might as well reap the benefits... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 31/10/2015 23:47, Mike Snitzer wrote: > Yes, with your commit ec8013be ("dm: do not forward ioctls from logical > volumes to the underlying device") you added protections to disallow > issuing ioctls to a partition that could impact the rest of the device. > > Given that I can see why you're seizing on the "ti->len != > i_size_read(bdev->bd_inode) >> SECTOR_SHIFT" negative checks that gate > the call to scsi_verify_blk_ioctl(). Right. > For Hannes, and in my head, it didn't matter if a future bdev satisfies > the length condition. I agree actually. The only problem is that the returned errno value is ENOTTY, and to userspace that "sounds like" a future bdev will not make the ioctl valid. > I could've sworn that unprivledged users (without CAP_SYS_RAWIO) > wouldn't be allowed to issue ioctls. Am I completely mistaken? They are allowed to issue ioctls. CAP_SYS_RAWIO changes that to also allow issuing of ioctls to partitions. That was required by Linus for backwards compatibility. > Or is > it still contentious and DM-mpath removing the ability to allow these > unprivledged ioctls (as a side-effect of Hannes' commit ec8013be) makes > your life, and other virt users' lives, harder? Yes, it would. virt runs as an unprivileged user (so does CD burning, which was the original reason to let SG_IO run by unprivileged users; there are probably other use cases). Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 31/10/2015 19:13, Mike Snitzer wrote: > > But that's wrong, I think. It's a false positive in > > scsi_verify_blk_ioctl(). > > > > If the ioctl is valid when bdev becomes non-NULL (and it will be if > > ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT), > > you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't > > think the ioctls can go away and come back. So Hannes's patch broke the > > userspace ABI. :( > > Huh? All that Hannes' patch did was add early verification of the ioctl > if there are no paths, since: there is no point queueing an ioctl that > is invalid. > > [snip discussion of Christoph's patches] > > The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid. > It has nothing to do with the existance of a bdev or not; but everything > to do with the unprivledged user's request to issue an ioctl. ... but the call is skipped (and all ioctls are valid) if ti->len == i_size_read(bdev->bd_inode) >> SECTOR_SHIFT. Therefore, until you have the bdev you don't know which ioctls are valid, and you must assume all of them are. You can't do anything unsafe anyway until you have the bdev. This is the reasoning prior to Hannes's change. Afterwards, you end up calling scsi_verify_blk_ioctl() when bdev == NULL. If the future bdev satisfies the above condition on ti->len, this means that ioctl(SG_IO) switches from ENOTTY to available. Userspace is clearly not expecting that. > Paolo, AFAIK unprivledged ioctls is one of your pet-projects so your > insight on what, if anything, needs changing to support them is the > insight I think we need. I hope the above provides some extra information. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
On 29/10/2015 14:18, Mike Snitzer wrote: > > 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285); > >it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl(). > > > > $ dmesg > > <...> > > [] device-mapper: multipath: Failing path 65:144. > > [] device-mapper: multipath: Failing path 67:144. > > [] device-mapper: multipath: Failing path 65:224. > > [] device-mapper: multipath: Failing path 68:32. > > [] sgio_inquiry: sending ioctl 2285 to a partition! > > So scsi_verify_blk_ioctl() considers the ioctl invalid. But that's wrong, I think. It's a false positive in scsi_verify_blk_ioctl(). If the ioctl is valid when bdev becomes non-NULL (and it will be if ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT), you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't think the ioctls can go away and come back. So Hannes's patch broke the userspace ABI. :( Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
On 17/09/2015 17:32, Bart Van Assche wrote: > >> + >> +#ifndef __KERNEL__ >> +/* Keep in sync with SG_DEFAULT_TIMEOUT of scsi/sg.h */ >> #define SG_DEFAULT_TIMEOUT(60*HZ) /* HZ == 'jiffies in 1 >> second' */ >> #endif > > Is it useful and/or necessary to export this constant ? To me this looks > like an implementation aspect rather than an aspect of the scsi-sg API. That's fine, I can remove it. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
Provide a UAPI version of the header in the kernel, making it easier for interested projects to use an up-to-date version of the header. The new headers are placed under uapi/linux/ so as not to conflict with the glibc-provided headers in /usr/include/scsi. /dev/sgN default values are implementation aspects, and are moved to drivers/scsi/sg.c instead (together with e.g. SG_ALLOW_DIO_DEF). However, SG_SCATTER_SZ is used by Wine so it is kept in linux/sg.h SG_MAX_QUEUE could also be useful. struct scsi_ioctl_command and struct scsi_idlun used to be under "#ifdef __KERNEL__", but they are actually useful for userspace as well. Add them to the new header. Cc: James Bottomley <jbottom...@parallels.com> Cc: Christoph Hellwig <h...@lst.de> Cc: linux-scsi@vger.kernel.org Cc: Bart Van Assche <bart.vanass...@sandisk.com> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- drivers/scsi/sg.c | 8 +- include/scsi/scsi_ioctl.h | 50 +- include/scsi/sg.h | 261 +- include/uapi/linux/Kbuild | 2 + include/{scsi => uapi/linux}/scsi_ioctl.h | 23 +-- include/{scsi => uapi/linux}/sg.h | 40 + 6 files changed, 25 insertions(+), 359 deletions(-) copy include/{scsi => uapi/linux}/scsi_ioctl.h (77%) copy include/{scsi => uapi/linux}/sg.h (91%) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 9d7b7db75e4b..3410e50a37d0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -90,7 +90,14 @@ static void sg_proc_cleanup(void); */ #define MULDIV(X,MUL,DIV) X % DIV) * MUL) / DIV) + ((X / DIV) * MUL)) +#define SG_DEFAULT_TIMEOUT_USER(60*USER_HZ) /* HZ == 'jiffies in 1 second' */ #define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) +#define SG_DEFAULT_RETRIES 0 +#define SG_DEF_FORCE_LOW_DMA 0 +#define SG_DEF_FORCE_PACK_ID 0 +#define SG_DEF_KEEP_ORPHAN 0 +#define SG_DEF_COMMAND_Q 0 +#define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ int sg_big_buff = SG_DEF_RESERVED_SIZE; /* N.B. This variable is readable and writeable via diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index 2bd9d67c201a..d54f9db2e079 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -1,60 +1,12 @@ #ifndef _SCSI_IOCTL_H #define _SCSI_IOCTL_H -#define SCSI_IOCTL_SEND_COMMAND 1 -#define SCSI_IOCTL_TEST_UNIT_READY 2 -#define SCSI_IOCTL_BENCHMARK_COMMAND 3 -#define SCSI_IOCTL_SYNC 4 /* Request synchronous parameters */ -#define SCSI_IOCTL_START_UNIT 5 -#define SCSI_IOCTL_STOP_UNIT 6 -/* The door lock/unlock constants are compatible with Sun constants for - the cdrom */ -#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ -#define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ - -/* - * Here are some obsolete SCSI-specific ioctl commands. - * - * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 - */ - -/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ -#define SCSI_IOCTL_GET_IDLUN 0x5382 - -/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ - -/* Used to obtain the host number of a device. */ -#define SCSI_IOCTL_PROBE_HOST 0x5385 - -/* Used to obtain the bus number for a device */ -#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 - -/* Used to obtain the PCI location of a device */ -#define SCSI_IOCTL_GET_PCI 0x5387 - -#ifdef __KERNEL__ +#include struct scsi_device; -/* - * Structures used for scsi_ioctl et al. - */ - -typedef struct scsi_ioctl_command { - unsigned int inlen; - unsigned int outlen; - unsigned char data[0]; -} Scsi_Ioctl_Command; - -typedef struct scsi_idlun { - __u32 dev_id; - __u32 host_unique_id; -} Scsi_Idlun; - - int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int cmd, bool ndelay); extern int scsi_ioctl(struct scsi_device *, int, void __user *); -#endif /* __KERNEL__ */ #endif /* _SCSI_IOCTL_H */ diff --git a/include/scsi/sg.h b/include/scsi/sg.h index 370c78c37926..f9d3d1dace41 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -2,267 +2,8 @@ #define _SCSI_GENERIC_H #include +#include -/* - * History: - * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. - * Development Sponsored by Killy Corp. NY NY - * - * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard - * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert - * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. - * - * Documentation - * = - * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] - * The documentation for the sg
[PATCH v3 2/4] scsi: cleanup scsi/scsi_ioctl.h
SCSI_REMOVAL_* goes together with other SCSI command constants in include/scsi/scsi.h. It is also used outside the implementation of the ioctls (and it is not part of the user API). scsi_fctargaddress/Scsi_FCTargAddress has had no in-tree use since commit ca61f10ab2b8 ("[SCSI] remove broken driver cpqfc", 2005-10-29). Remove it, just in time for the the tenth anniversary of its demise. Cc: James Bottomley <jbottom...@parallels.com> Cc: Christoph Hellwig <h...@lst.de> Cc: linux-scsi@vger.kernel.org Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/scsi.h | 6 ++ include/scsi/scsi_ioctl.h | 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index e0a3398b1547..5e2bafdbd96f 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -279,6 +279,12 @@ static inline int scsi_is_wlun(u64 lun) #define SCSI_INQ_PQ_NOT_CON 0x01 #define SCSI_INQ_PQ_NOT_CAP 0x03 +/* + * PREVENT/ALLOW MEDIUM REMOVAL + */ +#defineSCSI_REMOVAL_PREVENT1 +#defineSCSI_REMOVAL_ALLOW 0 + /* * Here are some scsi specific ioctl commands which are sometimes useful. diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index 8d19d1d233c3..c81962bef7a0 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -12,9 +12,6 @@ #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ #define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ -#defineSCSI_REMOVAL_PREVENT1 -#defineSCSI_REMOVAL_ALLOW 0 - #ifdef __KERNEL__ struct scsi_device; @@ -34,11 +31,6 @@ typedef struct scsi_idlun { __u32 host_unique_id; } Scsi_Idlun; -/* Fibre Channel WWN, port_id struct */ -typedef struct scsi_fctargaddress { - __u32 host_port_id; - unsigned char host_wwn[8]; // include NULL term. -} Scsi_FCTargAddress; int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int cmd, bool ndelay); -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 0/4] scsi: cleanup ioctl headers and provide UAPI versions
This is v3 of the series to provide an "official" sg.h header (and scsi_ioctl.h too, though it's basically obsolete) together with the other userspace API definitions. The change from v2 to v3 is that defaults for sg.c are not exported in include/uapi/linux/sg.c. Paolo 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/4] scsi: remove old-style type names from sg.h
These will not be exported by the new linux/sg.h header, and scsi/sg.h will not have any user API after linux/sg.h is created. Since they have no user in the kernel, they can be zapped. Cc: James Bottomley <jbottom...@parallels.com> Cc: Christoph Hellwig <h...@lst.de> Cc: linux-scsi@vger.kernel.org Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/sg.h | 6 -- 1 file changed, 6 deletions(-) diff --git a/include/scsi/sg.h b/include/scsi/sg.h index 3afec7032448..370c78c37926 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -207,12 +207,6 @@ typedef struct sg_req_info { /* used by SG_GET_REQUEST_TABLE ioctl() */ #define SG_BIG_BUFF SG_DEF_RESERVED_SIZE/* for backward compatibility */ -/* Alternate style type names, "..._t" variants preferred */ -typedef struct sg_io_hdr Sg_io_hdr; -typedef struct sg_io_vec Sg_io_vec; -typedef struct sg_scsi_id Sg_scsi_id; -typedef struct sg_req_info Sg_req_info; - /* vv */ /* The older SG interface based on the 'sg_header' structure follows. */ -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 3/4] scsi: move all obsolete ioctls to scsi_ioctl.h
Some are in scsi.h. Keep them together in preparation for exposing them in UAPI headers. Cc: James Bottomley <jbottom...@parallels.com> Cc: Christoph Hellwig <h...@lst.de> Cc: linux-scsi@vger.kernel.org Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/scsi.h | 20 include/scsi/scsi_ioctl.h | 20 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index 5e2bafdbd96f..a96df31af89e 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -286,26 +286,6 @@ static inline int scsi_is_wlun(u64 lun) #defineSCSI_REMOVAL_ALLOW 0 -/* - * Here are some scsi specific ioctl commands which are sometimes useful. - * - * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 - */ - -/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ -#define SCSI_IOCTL_GET_IDLUN 0x5382 - -/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ - -/* Used to obtain the host number of a device. */ -#define SCSI_IOCTL_PROBE_HOST 0x5385 - -/* Used to obtain the bus number for a device */ -#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 - -/* Used to obtain the PCI location of a device */ -#define SCSI_IOCTL_GET_PCI 0x5387 - /* Pull a u32 out of a SCSI message (using BE SCSI conventions) */ static inline __u32 scsi_to_u32(__u8 *ptr) { diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index c81962bef7a0..2bd9d67c201a 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -12,6 +12,26 @@ #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ #define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ +/* + * Here are some obsolete SCSI-specific ioctl commands. + * + * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 + */ + +/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ +#define SCSI_IOCTL_GET_IDLUN 0x5382 + +/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ + +/* Used to obtain the host number of a device. */ +#define SCSI_IOCTL_PROBE_HOST 0x5385 + +/* Used to obtain the bus number for a device */ +#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 + +/* Used to obtain the PCI location of a device */ +#define SCSI_IOCTL_GET_PCI 0x5387 + #ifdef __KERNEL__ struct scsi_device; -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
Provide a UAPI version of the header in the kernel, making it easier for interested projects to use an up-to-date version of the header. The new headers are placed under uapi/linux/ so as not to conflict with the glibc-provided headers in /usr/include/scsi. struct scsi_ioctl_command and struct scsi_idlun used to be under "#ifdef __KERNEL__", but they are actually useful for userspace as well. Add them to the new header. Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/scsi_ioctl.h | 50 +- include/scsi/sg.h | 260 +- include/uapi/linux/Kbuild | 2 + include/{scsi => uapi/linux}/scsi_ioctl.h | 23 +-- include/{scsi => uapi/linux}/sg.h | 20 +-- 5 files changed, 20 insertions(+), 335 deletions(-) copy include/{scsi => uapi/linux}/scsi_ioctl.h (77%) copy include/{scsi => uapi/linux}/sg.h (97%) diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index 2bd9d67c201a..d54f9db2e079 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -1,60 +1,12 @@ #ifndef _SCSI_IOCTL_H #define _SCSI_IOCTL_H -#define SCSI_IOCTL_SEND_COMMAND 1 -#define SCSI_IOCTL_TEST_UNIT_READY 2 -#define SCSI_IOCTL_BENCHMARK_COMMAND 3 -#define SCSI_IOCTL_SYNC 4 /* Request synchronous parameters */ -#define SCSI_IOCTL_START_UNIT 5 -#define SCSI_IOCTL_STOP_UNIT 6 -/* The door lock/unlock constants are compatible with Sun constants for - the cdrom */ -#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ -#define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ - -/* - * Here are some obsolete SCSI-specific ioctl commands. - * - * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 - */ - -/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ -#define SCSI_IOCTL_GET_IDLUN 0x5382 - -/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ - -/* Used to obtain the host number of a device. */ -#define SCSI_IOCTL_PROBE_HOST 0x5385 - -/* Used to obtain the bus number for a device */ -#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 - -/* Used to obtain the PCI location of a device */ -#define SCSI_IOCTL_GET_PCI 0x5387 - -#ifdef __KERNEL__ +#include struct scsi_device; -/* - * Structures used for scsi_ioctl et al. - */ - -typedef struct scsi_ioctl_command { - unsigned int inlen; - unsigned int outlen; - unsigned char data[0]; -} Scsi_Ioctl_Command; - -typedef struct scsi_idlun { - __u32 dev_id; - __u32 host_unique_id; -} Scsi_Idlun; - - int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int cmd, bool ndelay); extern int scsi_ioctl(struct scsi_device *, int, void __user *); -#endif /* __KERNEL__ */ #endif /* _SCSI_IOCTL_H */ diff --git a/include/scsi/sg.h b/include/scsi/sg.h index 370c78c37926..1c80177b9793 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -2,267 +2,11 @@ #define _SCSI_GENERIC_H #include +#include -/* - * History: - * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. - * Development Sponsored by Killy Corp. NY NY - * - * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard - * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert - * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. - * - * Documentation - * = - * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] - * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html - * Also see: /Documentation/scsi/scsi-generic.txt - * - * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html - */ - -#ifdef __KERNEL__ extern int sg_big_buff; /* for sysctl */ -#endif - - -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory - or scatter gather list */ -unsigned char __user *cmdp; /*
[PATCH v2 2/4] scsi: cleanup scsi/scsi_ioctl.h
SCSI_REMOVAL_* goes together with other SCSI command constants in include/scsi/scsi.h. It is also used outside the implementation of the ioctls (and it is not part of the user API). scsi_fctargaddress/Scsi_FCTargAddress has had no in-tree use since commit ca61f10ab2b8 ("[SCSI] remove broken driver cpqfc", 2005-10-29). Remove it, just in time for the the tenth anniversary of its demise. Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/scsi.h | 6 ++ include/scsi/scsi_ioctl.h | 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index e0a3398b1547..5e2bafdbd96f 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -279,6 +279,12 @@ static inline int scsi_is_wlun(u64 lun) #define SCSI_INQ_PQ_NOT_CON 0x01 #define SCSI_INQ_PQ_NOT_CAP 0x03 +/* + * PREVENT/ALLOW MEDIUM REMOVAL + */ +#defineSCSI_REMOVAL_PREVENT1 +#defineSCSI_REMOVAL_ALLOW 0 + /* * Here are some scsi specific ioctl commands which are sometimes useful. diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index 8d19d1d233c3..c81962bef7a0 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -12,9 +12,6 @@ #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ #define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ -#defineSCSI_REMOVAL_PREVENT1 -#defineSCSI_REMOVAL_ALLOW 0 - #ifdef __KERNEL__ struct scsi_device; @@ -34,11 +31,6 @@ typedef struct scsi_idlun { __u32 host_unique_id; } Scsi_Idlun; -/* Fibre Channel WWN, port_id struct */ -typedef struct scsi_fctargaddress { - __u32 host_port_id; - unsigned char host_wwn[8]; // include NULL term. -} Scsi_FCTargAddress; int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int cmd, bool ndelay); -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/4] scsi: move all obsolete ioctls to scsi/scsi_ioctl.h
Some are in scsi/scsi.h. Keep them together in preparation for exposing them in UAPI headers. Suggested-by: Christoph Hellwig <h...@lst.de> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/scsi.h | 20 include/scsi/scsi_ioctl.h | 20 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index 5e2bafdbd96f..a96df31af89e 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -286,26 +286,6 @@ static inline int scsi_is_wlun(u64 lun) #defineSCSI_REMOVAL_ALLOW 0 -/* - * Here are some scsi specific ioctl commands which are sometimes useful. - * - * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 - */ - -/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ -#define SCSI_IOCTL_GET_IDLUN 0x5382 - -/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ - -/* Used to obtain the host number of a device. */ -#define SCSI_IOCTL_PROBE_HOST 0x5385 - -/* Used to obtain the bus number for a device */ -#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 - -/* Used to obtain the PCI location of a device */ -#define SCSI_IOCTL_GET_PCI 0x5387 - /* Pull a u32 out of a SCSI message (using BE SCSI conventions) */ static inline __u32 scsi_to_u32(__u8 *ptr) { diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index c81962bef7a0..2bd9d67c201a 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -12,6 +12,26 @@ #define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ #define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ +/* + * Here are some obsolete SCSI-specific ioctl commands. + * + * Note that include/linux/cdrom.h also defines IOCTL 0x5300 - 0x5395 + */ + +/* Used to obtain PUN and LUN info. Conflicts with CDROMAUDIOBUFSIZ */ +#define SCSI_IOCTL_GET_IDLUN 0x5382 + +/* 0x5383 and 0x5384 were used for SCSI_IOCTL_TAGGED_{ENABLE,DISABLE} */ + +/* Used to obtain the host number of a device. */ +#define SCSI_IOCTL_PROBE_HOST 0x5385 + +/* Used to obtain the bus number for a device */ +#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386 + +/* Used to obtain the PCI location of a device */ +#define SCSI_IOCTL_GET_PCI 0x5387 + #ifdef __KERNEL__ struct scsi_device; -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/4] scsi: cleanup ioctl headers and provide UAPI versions
This is v2 of the series I sent a few days ago. A bit heavier on the cleanups: - export the structs for obsolete SCSI_IOCTL_SEND_COMMAND in uapi/linux/scsi_ioctl.h - reduce the kernel's scsi/scsi_ioctl.h even more - remove obsolete types from uapi/linux/sg.h - place all obsolete ioctls in uapi/linux/scsi_ioctl.h, even those formerly in scsi/scsi.h [requested by hch] ... and reverting to the flat uapi/linux/ structure instead of creating uapi/linux/scsi/. Paolo Paolo Bonzini (4): scsi: remove old-style type names from sg.h scsi: cleanup scsi/scsi_ioctl.h scsi: move all obsolete ioctls to scsi_ioctl.h scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h include/scsi/scsi.h | 20 +-- include/scsi/scsi_ioctl.h | 38 + include/scsi/sg.h | 266 +- include/uapi/linux/Kbuild | 2 + include/{scsi => uapi/linux}/scsi_ioctl.h | 41 +++-- include/{scsi => uapi/linux}/sg.h | 26 +-- 6 files changed, 38 insertions(+), 355 deletions(-) copy include/{scsi => uapi/linux}/scsi_ioctl.h (51%) copy include/{scsi => uapi/linux}/sg.h (95%) -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/4] scsi: remove old-style type names from sg.h
These will not be exported by the new linux/sg.h header, and scsi/sg.h will not have any user API after linux/sg.h is created. Since they have no user in the kernel, they can be zapped. Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- include/scsi/sg.h | 6 -- 1 file changed, 6 deletions(-) diff --git a/include/scsi/sg.h b/include/scsi/sg.h index 3afec7032448..370c78c37926 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -207,12 +207,6 @@ typedef struct sg_req_info { /* used by SG_GET_REQUEST_TABLE ioctl() */ #define SG_BIG_BUFF SG_DEF_RESERVED_SIZE/* for backward compatibility */ -/* Alternate style type names, "..._t" variants preferred */ -typedef struct sg_io_hdr Sg_io_hdr; -typedef struct sg_io_vec Sg_io_vec; -typedef struct sg_scsi_id Sg_scsi_id; -typedef struct sg_req_info Sg_req_info; - /* vv */ /* The older SG interface based on the 'sg_header' structure follows. */ -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
On 16/09/2015 15:39, Christoph Hellwig wrote: > Hi Paolo, > > can you just move them to include/linux/ directly? I'm trying to cater to the objections that were made to Andy's patch. If you mean the non-UAPI headers, James wanted them to stay in scsi/; if you mean the UAPI headers, Douglas complained about the flat structure of include/linux/. > Also scsi/scsi.h has some additional ioctl defintions that should be > added to the UAPI scsi_ioctl.h. Otherwise this looks ok to me. I can do this, but I think they are obsoleted by SG_GET_SCSI_ID, except for SCSI_IOCTL_GET_PCI which is horrible anyway. :) The comment about conflicts with CDROM ioctls is also interesting. I can see why one would want to keep them out of the new canonical place for SCSI ioctls. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
On 16/09/2015 16:23, Christoph Hellwig wrote: > > I'm trying to cater to the objections that were made to Andy's patch. > > If you mean the non-UAPI headers, James wanted them to stay in scsi/; if > > you mean the UAPI headers, Douglas complained about the flat structure > > of include/linux/. > > Yes, keep the non-UAPI ones as is and move the UAPI ones to linux/ - the flat > structure is a feature, not a bug. > > > > Also scsi/scsi.h has some additional ioctl defintions that should be > > > added to the UAPI scsi_ioctl.h. Otherwise this looks ok to me. > > > > I can do this, but I think they are obsoleted by SG_GET_SCSI_ID, except > > for SCSI_IOCTL_GET_PCI which is horrible anyway. :) The comment about > > conflicts with CDROM ioctls is also interesting. I can see why one > > would want to keep them out of the new canonical place for SCSI ioctls. > > No point in hiding them. We can still officially deprecate them, but I'd > much rather have them in a single place than hidden away somewhere. Ok, will do both. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h
Provide a UAPI version of the header in the kernel, making it easier for interested projects to use an up-to-date version of the header. The new headers are placed under uapi/linux/scsi/ so as not to conflict with the glibc-provided headers in /usr/include/scsi. Cc: Andy Grover <agro...@redhat.com> Cc: James Bottomley <jbottom...@parallels.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Douglas Gilbert <dgilb...@interlog.com> Cc: Linux SCSI List <linux-scsi@vger.kernel.org> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- Similar to Andy Grover's patch from last January, though developed independently. Based on the remarks to Andy's patch, I'm leaving scsi.h aside (Christoph), leaving the internal header in scsi/sg.h (James) and moving the headers to include/linux/scsi/ (Douglas). MAINTAINERS| 1 + include/scsi/scsi_ioctl.h | 17 +- include/scsi/sg.h | 267 + include/uapi/linux/Kbuild | 1 + include/uapi/linux/scsi/Kbuild | 2 + include/{ => uapi/linux}/scsi/scsi_ioctl.h | 36 +--- include/{ => uapi/linux}/scsi/sg.h | 18 +- 7 files changed, 17 insertions(+), 325 deletions(-) create mode 100644 include/uapi/linux/scsi/Kbuild copy include/{ => uapi/linux}/scsi/scsi_ioctl.h (41%) copy include/{ => uapi/linux}/scsi/sg.h (97%) diff --git a/MAINTAINERS b/MAINTAINERS index 67a4443daed9..b40764d43dde 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9173,6 +9173,7 @@ T:git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git S: Maintained F: drivers/scsi/ F: include/scsi/ +F: include/uapi/linux/scsi/ SCSI TAPE DRIVER M: Kai Mäkisara <kai.makis...@kolumbus.fi> diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h index 8d19d1d233c3..5b07a0a91675 100644 --- a/include/scsi/scsi_ioctl.h +++ b/include/scsi/scsi_ioctl.h @@ -1,21 +1,7 @@ #ifndef _SCSI_IOCTL_H #define _SCSI_IOCTL_H -#define SCSI_IOCTL_SEND_COMMAND 1 -#define SCSI_IOCTL_TEST_UNIT_READY 2 -#define SCSI_IOCTL_BENCHMARK_COMMAND 3 -#define SCSI_IOCTL_SYNC 4 /* Request synchronous parameters */ -#define SCSI_IOCTL_START_UNIT 5 -#define SCSI_IOCTL_STOP_UNIT 6 -/* The door lock/unlock constants are compatible with Sun constants for - the cdrom */ -#define SCSI_IOCTL_DOORLOCK 0x5380 /* lock the eject mechanism */ -#define SCSI_IOCTL_DOORUNLOCK 0x5381 /* unlock the mechanism */ - -#defineSCSI_REMOVAL_PREVENT1 -#defineSCSI_REMOVAL_ALLOW 0 - -#ifdef __KERNEL__ +#include struct scsi_device; @@ -44,5 +30,4 @@ int scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int cmd, bool ndelay); extern int scsi_ioctl(struct scsi_device *, int, void __user *); -#endif /* __KERNEL__ */ #endif /* _SCSI_IOCTL_H */ diff --git a/include/scsi/sg.h b/include/scsi/sg.h index 3afec7032448..87b35e7c4c31 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -1,274 +1,11 @@ #ifndef _SCSI_GENERIC_H #define _SCSI_GENERIC_H -#include +#include -/* - * History: - * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. - * Development Sponsored by Killy Corp. NY NY - * - * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard - * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert - * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. - * - * Documentation - * = - * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] - * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html - * Also see: /Documentation/scsi/scsi-generic.txt - * - * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html - */ - -#ifdef __KERNEL__ extern int sg_big_buff; /* for sysctl */ -#endif - - -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory -
Re: [PATCH v2] virtio_scsi: don't select CONFIG_BLK_DEV_INTEGRITY
On 27/04/2015 14:56, Christoph Hellwig wrote: T10 PI is just another optional feature, LLDDs should work without the infrastructure. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/Kconfig | 1 - drivers/scsi/virtio_scsi.c | 11 ++- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index b021bcb..896bea6 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -1743,7 +1743,6 @@ config SCSI_BFA_FC config SCSI_VIRTIO tristate virtio-scsi support depends on VIRTIO - select BLK_DEV_INTEGRITY help This is the virtual HBA driver for virtio. If the kernel will be used in a virtual machine, say Y or M. diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index f164f24..285f775 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -501,6 +501,7 @@ static void virtio_scsi_init_hdr(struct virtio_device *vdev, cmd-crn = 0; } +#ifdef CONFIG_BLK_DEV_INTEGRITY static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, struct virtio_scsi_cmd_req_pi *cmd_pi, struct scsi_cmnd *sc) @@ -524,6 +525,7 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, blk_rq_sectors(rq) * bi-tuple_size); } +#endif static int virtscsi_queuecommand(struct virtio_scsi *vscsi, struct virtio_scsi_vq *req_vq, @@ -546,11 +548,14 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi, BUG_ON(sc-cmd_len VIRTIO_SCSI_CDB_SIZE); +#ifdef CONFIG_BLK_DEV_INTEGRITY if (virtio_has_feature(vscsi-vdev, VIRTIO_SCSI_F_T10_PI)) { virtio_scsi_init_hdr_pi(vscsi-vdev, cmd-req.cmd_pi, sc); memcpy(cmd-req.cmd_pi.cdb, sc-cmnd, sc-cmd_len); req_size = sizeof(cmd-req.cmd_pi); - } else { + } else +#endif + { virtio_scsi_init_hdr(vscsi-vdev, cmd-req.cmd, sc); memcpy(cmd-req.cmd.cdb, sc-cmnd, sc-cmd_len); req_size = sizeof(cmd-req.cmd); @@ -1002,6 +1007,7 @@ static int virtscsi_probe(struct virtio_device *vdev) shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE; shost-nr_hw_queues = num_queues; +#ifdef CONFIG_BLK_DEV_INTEGRITY if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) { host_prot = SHOST_DIF_TYPE1_PROTECTION | SHOST_DIF_TYPE2_PROTECTION | SHOST_DIF_TYPE3_PROTECTION | SHOST_DIX_TYPE1_PROTECTION | @@ -1010,6 +1016,7 @@ static int virtscsi_probe(struct virtio_device *vdev) scsi_host_set_prot(shost, host_prot); scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC); } +#endif err = scsi_add_host(shost, vdev-dev); if (err) @@ -1090,7 +1097,9 @@ static struct virtio_device_id id_table[] = { static unsigned int features[] = { VIRTIO_SCSI_F_HOTPLUG, VIRTIO_SCSI_F_CHANGE, +#ifdef CONFIG_BLK_DEV_INTEGRITY VIRTIO_SCSI_F_T10_PI, +#endif }; static struct virtio_driver virtio_scsi_driver = { Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] virtio_scsi: don't select CONFIG_BLK_DEV_INTEGRITY
On 23/04/2015 20:10, Christoph Hellwig wrote: T10 PI is just another optional feature, LLDDs should work without the infrastructure. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/Kconfig | 1 - drivers/scsi/virtio_scsi.c | 11 ++- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index b021bcb..896bea6 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -1743,7 +1743,6 @@ config SCSI_BFA_FC config SCSI_VIRTIO tristate virtio-scsi support depends on VIRTIO - select BLK_DEV_INTEGRITY help This is the virtual HBA driver for virtio. If the kernel will be used in a virtual machine, say Y or M. diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index f164f24..0f2bb5b 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -501,6 +501,7 @@ static void virtio_scsi_init_hdr(struct virtio_device *vdev, cmd-crn = 0; } +#ifdef CONFIG_BLK_DEV_INTEGRITY static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, struct virtio_scsi_cmd_req_pi *cmd_pi, struct scsi_cmnd *sc) @@ -524,6 +525,7 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, blk_rq_sectors(rq) * bi-tuple_size); } +#endif static int virtscsi_queuecommand(struct virtio_scsi *vscsi, struct virtio_scsi_vq *req_vq, @@ -546,11 +548,14 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi, BUG_ON(sc-cmd_len VIRTIO_SCSI_CDB_SIZE); +#ifdef CONFIG_BLK_DEV_INTEGRITY if (virtio_has_feature(vscsi-vdev, VIRTIO_SCSI_F_T10_PI)) { virtio_scsi_init_hdr_pi(vscsi-vdev, cmd-req.cmd_pi, sc); memcpy(cmd-req.cmd_pi.cdb, sc-cmnd, sc-cmd_len); req_size = sizeof(cmd-req.cmd_pi); - } else { + } else +#endif + { All good so far. virtio_scsi_init_hdr(vscsi-vdev, cmd-req.cmd, sc); memcpy(cmd-req.cmd.cdb, sc-cmnd, sc-cmd_len); req_size = sizeof(cmd-req.cmd); @@ -1002,6 +1007,7 @@ static int virtscsi_probe(struct virtio_device *vdev) shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE; shost-nr_hw_queues = num_queues; +#ifdef VIRTIO_SCSI_F_T10_PI This symbol is always defined; it is part of the uapi/ header. I think you wanted #ifdef CONFIG_BLK_DEV_INTEGRITY here as well. However, you can remove this #ifdef completely. This is because... if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) { ... this condition should always be false unless the feature is negotiated, and the feature should not be negotiated when CONFIG_BLK_DEV_INTEGRITY is disabled. Maybe you can add a WARN_ON(!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY)) depending on your taste. host_prot = SHOST_DIF_TYPE1_PROTECTION | SHOST_DIF_TYPE2_PROTECTION | SHOST_DIF_TYPE3_PROTECTION | SHOST_DIX_TYPE1_PROTECTION | Also, if you leave the #ifdef, host_prot is now unused if CONFIG_BLK_DEV_INTEGRITY is disabled. If you prefer to keep the #ifdef you should also change host_prot to a macro (e.g. VIRTIO_SCSI_HOST_PROT) to silence the compiler warning. @@ -1010,6 +1016,7 @@ static int virtscsi_probe(struct virtio_device *vdev) scsi_host_set_prot(shost, host_prot); scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC); } +#endif err = scsi_add_host(shost, vdev-dev); if (err) @@ -1090,7 +1097,9 @@ static struct virtio_device_id id_table[] = { static unsigned int features[] = { VIRTIO_SCSI_F_HOTPLUG, VIRTIO_SCSI_F_CHANGE, +#ifdef VIRTIO_SCSI_F_T10_PI This one definitely should be #ifdef CONFIG_BLK_DEV_INTEGRITY. Thanks, Paolo VIRTIO_SCSI_F_T10_PI, +#endif }; static struct virtio_driver virtio_scsi_driver = { -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Fix return code for ioctl( BLKRRPART ) if device is down
On 24/03/2015 11:16, Fam Zheng wrote: If issued right after link down, blockdev --rereadpt will be stuck for a while and then return normally. Although the underlying capacity and partition table are not correctly updated. And it means that userspace can't detect the error at all. Fix this by propargating the error of read capacity command through the stack, so that the ioctl could fail with -EIO. Fam Zheng (3): block: Return error in rescan_partitions if revalidating disk failed sd: Return error in sd_revalidate_disk if read capacity failed sd: Return -EIO if read capacity failed block/partition-generic.c | 6 +++--- drivers/scsi/sd.c | 22 +- 2 files changed, 16 insertions(+), 12 deletions(-) Reviewed-by: Paolo Bonzini pbonz...@redhat.com Though patch 3 could be seen as a change in userspace ABI, so I'm less sure about it. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 05/03/2015 14:37, Paolo Bonzini wrote: On 05/03/2015 14:33, Christoph Hellwig wrote: Any chance to get reviews for this series? Also we should at least expedite this first patch into 4.0-rc as it fixes scanning races in virtio_scsi. I reviewed 1 and 3, but I'm not really qualified for patch 2. Christoph, any news about these patches? Thanks, Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] scsi: proper state checking and module refcount handling in scsi_device_get
On 02/02/2015 14:01, Christoph Hellwig wrote: This effectively reverts commits 85b6c7 ([SCSI] sd: fix cache flushing on module removal (and individual device removal) and dc4515ea (scsi: always increment reference count). We now never call scsi_device_get from the shutdown path, and the fact that we started grabbing reference there in commit 85b6c7 turned out turned out to create more problems than it solves, and required workarounds for workarounds for workarounds. Move back to properly checking the device state and carefully handle module refcounting. Signed-off-by: Christoph Hellwig h...@lst.de --- drivers/scsi/scsi.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 9b38299..9b7fd0b 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode); * Description: Gets a reference to the scsi_device and increments the use count * of the underlying LLDD module. You must hold host_lock of the * parent Scsi_Host or already have a reference when calling this. + * + * This will fail if a device is deleted or cancelled, or when the LLD module + * is in the process of being unloaded. */ int scsi_device_get(struct scsi_device *sdev) { - if (sdev-sdev_state == SDEV_DEL) - return -ENXIO; + if (sdev-sdev_state == SDEV_DEL || sdev-sdev_state == SDEV_CANCEL) + goto fail; if (!get_device(sdev-sdev_gendev)) - return -ENXIO; - /* We can fail try_module_get if we're doing SCSI operations - * from module exit (like cache flush) */ - __module_get(sdev-host-hostt-module); - + goto fail; + if (!try_module_get(sdev-host-hostt-module)) + goto fail_put_device; return 0; + +fail_put_device: + put_device(sdev-sdev_gendev); +fail: + return -ENXIO; } EXPORT_SYMBOL(scsi_device_get); Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 02/02/2015 14:01, Christoph Hellwig wrote: Lock the device embedded in the scsi_device to protect against concurrent calls to -remove. Signed-off-by: Christoph Hellwig h...@lst.de Acked-by: Alan Stern st...@rowland.harvard.edu --- drivers/scsi/scsi_scan.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 983aed1..523faee 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device); void scsi_rescan_device(struct device *dev) { - if (!dev-driver) - return; - - if (try_module_get(dev-driver-owner)) { + device_lock(dev); + if (dev-driver try_module_get(dev-driver-owner)) { struct scsi_driver *drv = to_scsi_driver(dev-driver); if (drv-rescan) drv-rescan(dev); module_put(dev-driver-owner); } + device_unlock(dev); } EXPORT_SYMBOL(scsi_rescan_device); Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 05/03/2015 14:33, Christoph Hellwig wrote: Any chance to get reviews for this series? Also we should at least expedite this first patch into 4.0-rc as it fixes scanning races in virtio_scsi. I reviewed 1 and 3, but I'm not really qualified for patch 2. Paolo On Mon, Feb 02, 2015 at 02:01:24PM +0100, Christoph Hellwig wrote: Lock the device embedded in the scsi_device to protect against concurrent calls to -remove. Signed-off-by: Christoph Hellwig h...@lst.de Acked-by: Alan Stern st...@rowland.harvard.edu --- drivers/scsi/scsi_scan.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 983aed1..523faee 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device); void scsi_rescan_device(struct device *dev) { -if (!dev-driver) -return; - -if (try_module_get(dev-driver-owner)) { +device_lock(dev); +if (dev-driver try_module_get(dev-driver-owner)) { struct scsi_driver *drv = to_scsi_driver(dev-driver); if (drv-rescan) drv-rescan(dev); module_put(dev-driver-owner); } +device_unlock(dev); } EXPORT_SYMBOL(scsi_rescan_device); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ---end quoted text--- -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 02/02/2015 13:59, Christoph Hellwig wrote: On Fri, Jan 30, 2015 at 10:46:17AM +0100, Paolo Bonzini wrote: Great, we might want to revert that patch in 3.21. Is that fix in any tree yet? Seems like I missed it for the scsi tree at least. So unless you want it for 3.19/stable we might as well ust skip that patch. Yes, I agree. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] lib/iovec: Add memcpy_fromiovec_out library function
On 30/01/2015 09:12, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch adds a new memcpy_fromiovec_out() library function which modifies the passed *iov following memcpy_fromiovec(), but also returns the next current iovec pointer via **iov_out. This is useful for vhost ANY_LAYOUT support when guests are allowed to generate incoming virtio request headers combined with subsequent SGL payloads into a single iovec. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- include/linux/uio.h | 2 ++ lib/iovec.c | 27 +++ 2 files changed, 29 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 1c5e453..3e4473d 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -136,6 +136,8 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i); int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len); +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, + struct iovec **iov_out, int len); int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov, int offset, int len); int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata, diff --git a/lib/iovec.c b/lib/iovec.c index 2d99cb4..6a813dd 100644 --- a/lib/iovec.c +++ b/lib/iovec.c @@ -28,6 +28,33 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len) EXPORT_SYMBOL(memcpy_fromiovec); /* + * Copy iovec to kernel, saving the current iov to *iov_out. + * Returns -EFAULT on error. Perhaps document that iov is modified, zeroing everything in [iov, *iov_out) and possibly removing the front of *iov_out? Paolo + */ + +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, + struct iovec **iov_out, int len) +{ + while (len 0) { + if (iov-iov_len) { + int copy = min_t(unsigned int, len, iov-iov_len); + if (copy_from_user(kdata, iov-iov_base, copy)) + return -EFAULT; + len -= copy; + kdata += copy; + iov-iov_base += copy; + iov-iov_len -= copy; + } + if (!iov-iov_len) + iov++; + } + *iov_out = iov; + + return 0; +} +EXPORT_SYMBOL(memcpy_fromiovec_out); + +/* * Copy kernel to iovec. Returns -EFAULT on error. */ -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 30/01/2015 02:08, Fam Zheng wrote: On Fri, 01/30 00:11, Paolo Bonzini wrote: On 29/01/2015 00:00, Christoph Hellwig wrote: Lock the device embedded in the scsi_device to protect against concurrent calls to -remove. Signed-off-by: Christoph Hellwig h...@lst.de I wonder if this makes this problem: https://lkml.org/lkml/2015/1/5/9 go away. A quick test says yes. Great, we might want to revert that patch in 3.21. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 29/01/2015 00:00, Christoph Hellwig wrote: Lock the device embedded in the scsi_device to protect against concurrent calls to -remove. Signed-off-by: Christoph Hellwig h...@lst.de I wonder if this makes this problem: https://lkml.org/lkml/2015/1/5/9 go away. Paolo --- drivers/scsi/scsi_scan.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 983aed1..523faee 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1593,16 +1593,15 @@ EXPORT_SYMBOL(scsi_add_device); void scsi_rescan_device(struct device *dev) { - if (!dev-driver) - return; - - if (try_module_get(dev-driver-owner)) { + device_lock(dev); + if (dev-driver try_module_get(dev-driver-owner)) { struct scsi_driver *drv = to_scsi_driver(dev-driver); if (drv-rescan) drv-rescan(dev); module_put(dev-driver-owner); } + device_unlock(dev); } EXPORT_SYMBOL(scsi_rescan_device); -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] scsi: Fix max transfer length for 4k disks
On 29/01/2015 22:54, Brian King wrote: The following patch fixes an issue observed with 4k sector disks where the max_hw_sectors attribute was getting set too large in sd_revalidate_disk. Since sdkp-max_xfer_blocks is in units of SCSI logical blocks and queue_max_hw_sectors is in units of 512 byte blocks, on a 4k sector disk, every time we went through sd_revalidate_disk, we were taking the current value of queue_max_hw_sectors and increasing it by a factor of 8. Fix this by only shifting sdkp-max_xfer_blocks. Cc: stablesta...@vger.kernel.org Signed-off-by: Brian King brk...@linux.vnet.ibm.com --- drivers/scsi/sd.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff -puN drivers/scsi/sd.c~sd_revalidate_4k drivers/scsi/sd.c --- linux/drivers/scsi/sd.c~sd_revalidate_4k 2015-01-29 14:44:23.316171187 -0600 +++ linux-bjking1/drivers/scsi/sd.c 2015-01-29 14:51:05.846126392 -0600 @@ -2800,9 +2800,11 @@ static int sd_revalidate_disk(struct gen */ sd_set_flush_flag(sdkp); - max_xfer = min_not_zero(queue_max_hw_sectors(sdkp-disk-queue), - sdkp-max_xfer_blocks); + max_xfer = sdkp-max_xfer_blocks; max_xfer = ilog2(sdp-sector_size) - 9; + + max_xfer = min_not_zero(queue_max_hw_sectors(sdkp-disk-queue), + max_xfer); blk_queue_max_hw_sectors(sdkp-disk-queue, max_xfer); set_capacity(disk, sdkp-capacity); sd_config_write_same(sdkp); _ -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] SCSI: SD: set max_ws_blocks as max_unmap_blocks if it isn't provided
On 05/12/2014 14:58, Martin K. Petersen wrote: Ming == Ming Lei ming@canonical.com writes: What about in READ CAPACITY(16)? Ming It isn't set too. Please try the following patch: [SCSI] sd: Tweak discard heuristics to work around QEMU SCSI issue 7985090aa020 changed the discard heuristics to give preference to the WRITE SAME commands that (unlike UNMAP) guarantee deterministic results. Ming Lei discovered that QEMU SCSI's WRITE SAME implementation internally relied on limits that were only communicated for the UNMAP case. And therefore discard commands backed by WRITE SAME would fail. Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only prefer the WRITE SAME variants if the device has the LBPRZ flag set. Reported-by: Ming Lei ming@canonical.com Signed-off-by: Martin K. Petersen martin.peter...@oracle.com diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 95bfb7bfbb9d..76c5d1388417 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -2623,8 +2623,9 @@ static void sd_read_block_limits(struct scsi_disk *sdkp) sd_config_discard(sdkp, SD_LBP_WS16); } else {/* LBP VPD page tells us what to use */ - - if (sdkp-lbpws) + if (sdkp-lbpu sdkp-max_unmap_blocks !sdkp-lbprz) + sd_config_discard(sdkp, SD_LBP_UNMAP); + else if (sdkp-lbpws) sd_config_discard(sdkp, SD_LBP_WS16); else if (sdkp-lbpws10) sd_config_discard(sdkp, SD_LBP_WS10); This is the right fix. Ming, how do you reproduce the QEMU bug? Acked-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] SCSI: SD: set max_ws_blocks as max_unmap_blocks if it isn't provided
On 05/12/2014 14:05, Ming Lei wrote: [ 50.112885] sd 0:0:1:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 50.113859] sd 0:0:1:0: [sda] Sense Key : Illegal Request [current] [ 50.113859] sd 0:0:1:0: [sda] Add. Sense: Invalid field in cdb [ 50.113859] sd 0:0:1:0: [sda] CDB: [ 50.113859] Write same(16): 93 08 00 00 00 00 00 00 80 00 00 40 00 00 00 00 [ 50.113859] blk_update_request: critical target error, dev sda, sector 32768 So this command is zeroing 2^22 sectors (2GB) starting from sector 128. How did you run QEMU and what command produced this request? Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] libata: Whitelist SSDs that are known to properly return zeroes after TRIM
On 07/11/2014 06:08, Martin K. Petersen wrote: The whitelist is only meant as a starting point and is by no means comprehensive: - All intel SSD models except for 510 - Micron M5* - Samsung SSDs - Seagate SSDs Signed-off-by: Martin K. Petersen martin.peter...@oracle.com --- drivers/ata/libata-core.c | 18 ++ drivers/ata/libata-scsi.c | 10 ++ include/linux/libata.h| 1 + 3 files changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index c5ba15af87d3..f41f24a8bc21 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4225,10 +4225,20 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { { PIONEER DVD-RW DVR-216D, NULL, ATA_HORKAGE_NOSETXFER }, /* devices that don't properly handle queued TRIM commands */ - { Micron_M500*, NULL, ATA_HORKAGE_NO_NCQ_TRIM, }, - { Crucial_CT???M500SSD*, NULL, ATA_HORKAGE_NO_NCQ_TRIM, }, - { Micron_M550*, NULL, ATA_HORKAGE_NO_NCQ_TRIM, }, - { Crucial_CT*M550SSD*,NULL, ATA_HORKAGE_NO_NCQ_TRIM, }, + { Micron_M5?0*, NULL, ATA_HORKAGE_NO_NCQ_TRIM | + ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { Crucial_CT???M5?0SSD*, NULL, ATA_HORKAGE_NO_NCQ_TRIM, }, I have a Crucial_CT256MX1 (i.e. MX100) and it does reliably zero. BTW. it's the same hardware as the M550, so probably the same set of quirks should apply to both. Paolo + + /* + * DRAT/RZAT are weak guarantees. Explicitly black/whitelist + * SSDs that provide reliable zero after TRIM. + */ + { INTEL*SSDSC2MH*,NULL, 0, }, /* Blacklist intel 510 */ + { INTEL*SSD*, NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { SSD*INTEL*, NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { Samsung*SSD*, NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { SAMSUNG*SSD*, NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { ST[1248][0248]0[FH]*, NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, /* * Some WD SATA-I drives spin up and down erratically when the link diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 0586f66d70fa..deaa6e34ed4d 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2515,13 +2515,15 @@ static unsigned int ata_scsiop_read_cap(struct ata_scsi_args *args, u8 *rbuf) rbuf[15] = lowest_aligned; if (ata_id_has_trim(args-id)) { - rbuf[14] |= 0x80; /* TPE */ + rbuf[14] |= 0x80; /* LBPME */ - if (ata_id_has_zero_after_trim(args-id)) - rbuf[14] |= 0x40; /* TPRZ */ + if (ata_id_has_zero_after_trim(args-id) + dev-horkage ATA_HORKAGE_ZERO_AFTER_TRIM) { + ata_dev_warn(dev, Enabling discard_zeroes_data\n); + rbuf[14] |= 0x40; /* LBPRZ */ + } } } - return 0; } diff --git a/include/linux/libata.h b/include/linux/libata.h index bd5fefeaf548..45ac825b8366 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -421,6 +421,7 @@ enum { ATA_HORKAGE_NO_NCQ_TRIM = (1 19),/* don't use queued TRIM */ ATA_HORKAGE_NOLPM = (1 20),/* don't use LPM */ ATA_HORKAGE_WD_BROKEN_LPM = (1 21), /* some WDs have broken LPM */ + ATA_HORKAGE_ZERO_AFTER_TRIM = (1 22),/* guarantees zero after trim */ /* DMA mask for user DMA control: User visible values; DO NOT renumber */ -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] esp_scsi: spellcheck 'driver'
On 21/11/2014 10:27, Hannes Reinecke wrote: Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/esp_scsi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h index cd68805..b5862e4 100644 --- a/drivers/scsi/esp_scsi.h +++ b/drivers/scsi/esp_scsi.h @@ -1,4 +1,4 @@ -/* esp_scsi.h: Defines and structures for the ESP drier. +/* esp_scsi.h: Defines and structures for the ESP driver. * * Copyright (C) 2007 David S. Miller (da...@davemloft.net) */ A good start. :) Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] esp_scsi: make number of tags configurable
On 21/11/2014 10:27, Hannes Reinecke wrote: Add a field 'num_tags' to the esp structure to allow drivers to overwrite the number of avialable tags if required. Default is ESP_DEFAULT_TAGS. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/esp_scsi.c | 10 -- drivers/scsi/esp_scsi.h | 3 +-- 2 files changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 38c23e0..99cd42f 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -2317,6 +2317,8 @@ int scsi_esp_register(struct esp *esp, struct device *dev) static int instance; int err; + if (!esp-num_tags) + esp-num_tags = ESP_DEFAULT_TAGS; esp-host-transportt = esp_transport_template; esp-host-max_lun = ESP_MAX_LUN; esp-host-cmd_per_lun = 2; @@ -2403,12 +2405,8 @@ static int esp_slave_configure(struct scsi_device *dev) struct esp *esp = shost_priv(dev-host); struct esp_target_data *tp = esp-target[dev-id]; - if (dev-tagged_supported) { - /* XXX make this configurable somehow XXX */ - int goal_tags = min(ESP_DEFAULT_TAGS, ESP_MAX_TAG); min(16, 256) = 16 Might add a WARN_ON(esp-num_tags ESP_MAX_TAG) after you assign esp-num_tags. - scsi_adjust_queue_depth(dev, goal_tags); - } + if (dev-tagged_supported) + scsi_adjust_queue_depth(dev, esp-num_tags); tp-flags |= ESP_TGT_DISCONNECT; diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h index b5862e4..975d293 100644 --- a/drivers/scsi/esp_scsi.h +++ b/drivers/scsi/esp_scsi.h @@ -283,7 +283,6 @@ struct esp_cmd_entry { struct completion *eh_done; }; -/* XXX make this configurable somehow XXX */ #define ESP_DEFAULT_TAGS 16 #define ESP_MAX_TARGET 16 @@ -445,7 +444,7 @@ struct esp { u8 prev_soff; u8 prev_stp; u8 prev_cfg3; - u8 __pad; + u8 num_tags; struct list_headesp_cmd_pool; Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/10] esp_scsi: debug event and command
On 21/11/2014 10:27, Hannes Reinecke wrote: Add new debug definitions for event and command logging. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/esp_scsi.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 492c51b..fe3278e 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -49,6 +49,8 @@ static u32 esp_debug; #define ESP_DEBUG_DATADONE 0x0100 #define ESP_DEBUG_RECONNECT 0x0200 #define ESP_DEBUG_AUTOSENSE 0x0400 +#define ESP_DEBUG_EVENT 0x0800 +#define ESP_DEBUG_COMMAND0x1000 #define esp_log_intr(f, a...) \ do { if (esp_debug ESP_DEBUG_INTR) \ @@ -100,6 +102,16 @@ do { if (esp_debug ESP_DEBUG_AUTOSENSE) \ shost_printk(KERN_DEBUG, esp-host, f, ## a); \ } while (0) +#define esp_log_event(f, a...) \ +do { if (esp_debug ESP_DEBUG_EVENT) \ + shost_printk(KERN_DEBUG, esp-host, f, ## a); \ +} while (0) + +#define esp_log_command(f, a...) \ +do { if (esp_debug ESP_DEBUG_COMMAND)\ + shost_printk(KERN_DEBUG, esp-host, f, ## a); \ +} while (0) + #define esp_read8(REG) esp-ops-esp_read8(esp, REG) #define esp_write8(VAL,REG) esp-ops-esp_write8(esp, VAL, REG) @@ -126,6 +138,7 @@ void scsi_esp_cmd(struct esp *esp, u8 val) esp-esp_event_cur = (idx + 1) (ESP_EVENT_LOG_SZ - 1); + esp_log_command(cmd[%02x]\n, val); esp_write8(val, ESP_CMD); } EXPORT_SYMBOL(scsi_esp_cmd); @@ -1638,6 +1651,8 @@ static int esp_process_event(struct esp *esp) again: write = 0; + esp_log_event(process event %d phase %x\n, + esp-event, esp-sreg ESP_STAT_PMASK); switch (esp-event) { case ESP_EVENT_CHECK_PHASE: switch (esp-sreg ESP_STAT_PMASK) { Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/10] esp_scsi: convert to dev_printk
, esp-dma_regs, +esp-host-irq); + dev_printk(KERN_INFO, dev, +esp%u: is a %s, %u MHz (ccf=%u), SCSI ID %u\n, +esp-host-unique_id, esp_chip_names[esp-rev], +esp-cfreq / 100, esp-cfact, esp-scsi_id); /* Let the SCSI bus reset settle. */ ssleep(esp_bus_reset_settle); @@ -2435,19 +2434,20 @@ static int esp_eh_abort_handler(struct scsi_cmnd *cmd) * XXX much for the final driver. */ spin_lock_irqsave(esp-host-host_lock, flags); - printk(KERN_ERR PFX esp%d: Aborting command [%p:%02x]\n, -esp-host-unique_id, cmd, cmd-cmnd[0]); + shost_printk(KERN_ERR, esp-host, Aborting command [%p:%02x]\n, + cmd, cmd-cmnd[0]); ent = esp-active_cmd; if (ent) - printk(KERN_ERR PFX esp%d: Current command [%p:%02x]\n, -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]); + shost_printk(KERN_ERR, esp-host, + Current command [%p:%02x]\n, + ent-cmd, ent-cmd-cmnd[0]); list_for_each_entry(ent, esp-queued_cmds, list) { - printk(KERN_ERR PFX esp%d: Queued command [%p:%02x]\n, -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]); + shost_printk(KERN_ERR, esp-host, Queued command [%p:%02x]\n, + ent-cmd, ent-cmd-cmnd[0]); } list_for_each_entry(ent, esp-active_cmds, list) { - printk(KERN_ERR PFX esp%d: Active command [%p:%02x]\n, -esp-host-unique_id, ent-cmd, ent-cmd-cmnd[0]); + shost_printk(KERN_ERR, esp-host, Active command [%p:%02x]\n, + ent-cmd, ent-cmd-cmnd[0]); } esp_dump_cmd_log(esp); spin_unlock_irqrestore(esp-host-host_lock, flags); Looks good. Reviewed-by: Paolo Bonzini pbonz...@redhat.com Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] esp_scsi: read status registers
On 21/11/2014 10:27, Hannes Reinecke wrote: A read to ESP_INTRPT will clear ESP_STATUS and ESP_SSTEP. So read all status registers in one go to avoid losing information. (ESP_STAT_TCNT is actually kept in the status register, it is cleared by writing TCLO/MID/HI). Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/esp_scsi.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index fe3278e..92ab921 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -982,7 +982,6 @@ static int esp_check_spur_intr(struct esp *esp) default: if (!(esp-sreg ESP_STAT_INTR)) { - esp-ireg = esp_read8(ESP_INTRPT); if (esp-ireg ESP_INTR_SR) return 1; @@ -2056,7 +2055,12 @@ static void __esp_interrupt(struct esp *esp) int finish_reset, intr_done; u8 phase; + /* + * Once INTRPT is read STATUS and SSTEP are cleared. + */ esp-sreg = esp_read8(ESP_STATUS); + esp-seqreg = esp_read8(ESP_SSTEP); + esp-ireg = esp_read8(ESP_INTRPT); if (esp-flags ESP_FLAG_RESETTING) { finish_reset = 1; @@ -2069,8 +2073,6 @@ static void __esp_interrupt(struct esp *esp) return; } - esp-ireg = esp_read8(ESP_INTRPT); - if (esp-ireg ESP_INTR_SR) finish_reset = 1; -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/10] esp: Use FIFO for command submission
On 21/11/2014 10:27, Hannes Reinecke wrote: The am53c974 has a design flaw causing it to throw an DMA interrupt whenever a DMA transmission completed, even though DMA interrupt reporting is disabled. This confuses the esp routines as it would register a DMA interrupt whenever a cdb has been transmitted to the drive. So implement an option to use the FIFO for command submission and enable it for am53c974. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/am53c974.c | 1 + drivers/scsi/esp_scsi.c | 83 ++--- drivers/scsi/esp_scsi.h | 1 + 3 files changed, 66 insertions(+), 19 deletions(-) diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c index b9af8b0..652762e 100644 --- a/drivers/scsi/am53c974.c +++ b/drivers/scsi/am53c974.c @@ -401,6 +401,7 @@ static int pci_esp_probe_one(struct pci_dev *pdev, esp-host = shost; esp-dev = pdev; esp-ops = pci_esp_ops; + esp-flags |= ESP_FLAG_USE_FIFO; Why not invert this patch and the previous one, and add this line directly when the am53c974 driver is born? pep-esp = esp; if (pci_request_regions(pdev, DRV_MODULE_NAME)) { diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 92ab921..ab7d2bc 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -605,7 +605,7 @@ static void esp_autosense(struct esp *esp, struct esp_cmd_entry *ent) { struct scsi_cmnd *cmd = ent-cmd; struct scsi_device *dev = cmd-device; - int tgt, lun; + int tgt, lun, i; u8 *p, val; tgt = dev-id; @@ -626,7 +626,10 @@ static void esp_autosense(struct esp *esp, struct esp_cmd_entry *ent) esp-active_cmd = ent; - p = esp-command_block; + if (esp-flags ESP_FLAG_USE_FIFO) + p = esp-fifo; + else + p = esp-command_block; Any reason not to use esp-command_block unconditionally? esp-msg_out_len = 0; *p++ = IDENTIFY(0, lun); @@ -648,12 +651,21 @@ static void esp_autosense(struct esp *esp, struct esp_cmd_entry *ent) esp_write_tgt_sync(esp, tgt); esp_write_tgt_config3(esp, tgt); - val = (p - esp-command_block); + if (esp-flags ESP_FLAG_USE_FIFO) + val = (p - esp-fifo); + else + val = (p - esp-command_block); if (esp-rev == FASHME) scsi_esp_cmd(esp, ESP_CMD_FLUSH); For consistency with what you do elsewhere, please move this in the else. - esp-ops-send_dma_cmd(esp, esp-command_block_dma, -val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA); + if (esp-flags ESP_FLAG_USE_FIFO) { + scsi_esp_cmd(esp, ESP_CMD_FLUSH); + for (i = 0; i val; i++) + esp_write8(esp-fifo[i], ESP_FDATA); + scsi_esp_cmd(esp, ESP_CMD_SELA); + } else + esp-ops-send_dma_cmd(esp, esp-command_block_dma, +val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA); } Hmm, what about a wrapper __send_cmd(esp, data, esp_count, dma_count, cmd, force_flush, force_fifo) { use_fifo = force_fifo || (esp-flags ESP_FLAG_USE_FIFO) || esp_count == 1; if (force_flush || use_fifo || esp-rev == FASHME) scsi_esp_cmd(esp, ESP_CMD_FLUSH); if (use_fifo) { for (i = 0; i val; i++) esp_write8(esp-fifo[i], ESP_FDATA); scsi_esp_cmd(esp, cmd); } else { if (data != esp-command_block) memcpy(esp-command_block, data, length) esp-ops-send_dma_cmd(esp, esp-command_block_dma, esp_count, dma_count, 0, cmd | ESP_CMD_DMA); } } send_cmd(esp, data, esp_count, dma_count, cmd) { __send_cmd(esp, data, esp_count, dma_count, cmd, false, false); } This would be: send_cmd(esp, esp-command_block, val, 16, ESP_CMD_SELA); static struct esp_cmd_entry *find_and_prep_issuable_command(struct esp *esp) @@ -727,7 +739,10 @@ static void esp_maybe_execute_command(struct esp *esp) esp_check_command_len(esp, cmd); - p = esp-command_block; + if (esp-flags ESP_FLAG_USE_FIFO) + p = esp-fifo; Any reason not to use esp-command_block unconditionally? + else + p = esp-command_block; esp-msg_out_len = 0; if (tp-flags ESP_TGT_CHECK_NEGO) { @@ -789,13 +804,15 @@ build_identify: } if (!(esp-flags ESP_FLAG_DOING_SLOWCMD)) { - start_cmd = ESP_CMD_DMA | ESP_CMD_SELA; + start_cmd = ESP_CMD_SELA; if (ent-tag[0]) { *p++ = ent-tag[0]; *p++ = ent-tag[1]; - start_cmd = ESP_CMD_DMA | ESP_CMD_SA3; +
Re: [PATCH 10/10] esp: enable CONFIG2_FENAB for am53c974
On 21/11/2014 10:27, Hannes Reinecke wrote: CONFIG2_FENAB ('feature enable') changed definition between chip revisions, from 'Latch SCSI Phase' to 'Latch SCSI Phase, display chip ID upon reset, and enable 24 bit addresses'. So only enable it for am53c974 where we know what it's doing. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/am53c974.c | 30 ++ drivers/scsi/esp_scsi.c | 4 2 files changed, 34 insertions(+) diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c index 0452ed1..722e781 100644 --- a/drivers/scsi/am53c974.c +++ b/drivers/scsi/am53c974.c @@ -252,6 +252,8 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 addr, u32 esp_count, pci_esp_write8(esp, (esp_count 0) 0xff, ESP_TCLOW); pci_esp_write8(esp, (esp_count 8) 0xff, ESP_TCMED); + if (esp-config2 ESP_CONFIG2_FENAB) + pci_esp_write8(esp, (esp_count 16) 0xff, ESP_TCHI); Why do this conditionally? We know that FENAB is true here, don't we? (Maybe I'm missing something obvious though). Paolo pci_esp_write32(esp, esp_count, ESP_DMA_STC); pci_esp_write32(esp, addr, ESP_DMA_SPA); @@ -265,6 +267,33 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 addr, u32 esp_count, pci_esp_write8(esp, ESP_DMA_CMD_START | val, ESP_DMA_CMD); } +static u32 pci_esp_dma_length_limit(struct esp *esp, u32 dma_addr, u32 dma_len) +{ + int dma_limit = 16; + u32 base, end; + + /* + * If CONFIG2_FENAB is set we can + * handle up to 24 bit addresses + */ + if (esp-config2 ESP_CONFIG2_FENAB) + dma_limit = 24; + + if (dma_len (1U dma_limit)) + dma_len = (1U dma_limit); + + /* + * Prevent crossing a 24-bit address boundary. + */ + base = dma_addr ((1U 24) - 1U); + end = base + dma_len; + if (end (1U 24)) + end = (1U 24); + dma_len = end - base; + + return dma_len; +} + static const struct esp_driver_ops pci_esp_ops = { .esp_write8 = pci_esp_write8, .esp_read8 = pci_esp_read8, @@ -278,6 +307,7 @@ static const struct esp_driver_ops pci_esp_ops = { .dma_invalidate = pci_esp_dma_invalidate, .send_dma_cmd = pci_esp_send_dma_cmd, .dma_error = pci_esp_dma_error, + .dma_length_limit = pci_esp_dma_length_limit, }; /* diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 01753f5..d0b7b32 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -289,6 +289,8 @@ static void esp_reset_esp(struct esp *esp) case FAS236: case PCSCSI: + if (esp-rev == PCSCSI) + esp-config2 |= ESP_CONFIG2_FENAB; /* Fast 236, AM53c974 or HME */ esp_write8(esp-config2, ESP_CFG2); if (esp-rev == FASHME) { @@ -1362,6 +1364,8 @@ static int esp_data_bytes_sent(struct esp *esp, struct esp_cmd_entry *ent, (((unsigned int)esp_read8(ESP_TCMED)) 8)); if (esp-rev == FASHME) ecount |= ((unsigned int)esp_read8(FAS_RLO)) 16; + if (esp-rev == PCSCSI (esp-config2 ESP_CONFIG2_FENAB)) + ecount |= ((unsigned int)esp_read8(ESP_TCHI)) 16; } bytes_sent = esp-data_dma_len; -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 08/10] am53c974: BLAST residual handling
On 21/11/2014 10:27, Hannes Reinecke wrote: The am53c974 has an design issue where a single byte might be left in the SCSI FIFO after a DMA transfer. As the handling code is currently untested add a WARN_ON() statement here. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/am53c974.c | 6 ++ drivers/scsi/esp_scsi.c | 29 + drivers/scsi/esp_scsi.h | 1 + 3 files changed, 36 insertions(+) diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c index 652762e..0b7643e 100644 --- a/drivers/scsi/am53c974.c +++ b/drivers/scsi/am53c974.c @@ -195,6 +195,12 @@ static void pci_esp_dma_drain(struct esp *esp) shost_printk(KERN_INFO, esp-host, DMA blast done (%d tries, %d bytes left)\n, lim, resid); #endif + /* BLAST residual handling is currently untested */ + if (WARN_ON(resid == 1)) { WARN_ON_ONCE? Paolo + struct esp_cmd_entry *ent = esp-active_cmd; + + ent-flags |= ESP_CMD_FLAG_RESIDUAL; + } } static void pci_esp_dma_invalidate(struct esp *esp) diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index ab7d2bc..88272bb 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -1353,6 +1353,35 @@ static int esp_data_bytes_sent(struct esp *esp, struct esp_cmd_entry *ent, bytes_sent = esp-data_dma_len; bytes_sent -= ecount; + /* + * The am53c974 has a DMA 'pecularity'. The doc states: + * In some odd byte conditions, one residual byte will + * be left in the SCSI FIFO, and the FIFO Flags will + * never count to '0 '. When this happens, the residual + * byte should be retrieved via PIO following completion + * of the BLAST operation. + */ + if (fifo_cnt == 1 ent-flags ESP_CMD_FLAG_RESIDUAL) { + size_t count = 1; + size_t offset = bytes_sent; + u8 bval = esp_read8(ESP_FDATA); + + if (ent-flags ESP_CMD_FLAG_AUTOSENSE) + ent-sense_ptr[bytes_sent] = bval; + else { + struct esp_cmd_priv *p = ESP_CMD_PRIV(cmd); + u8 *ptr; + + ptr = scsi_kmap_atomic_sg(p-cur_sg, p-u.num_sg, + offset, count); + if (likely(ptr)) { + *(ptr + offset) = bval; + scsi_kunmap_atomic_sg(ptr); + } + } + bytes_sent += fifo_cnt; + ent-flags = ~ESP_CMD_FLAG_RESIDUAL; + } if (!(ent-flags ESP_CMD_FLAG_WRITE)) bytes_sent -= fifo_cnt; diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h index 27dcaf8..5fa456c 100644 --- a/drivers/scsi/esp_scsi.h +++ b/drivers/scsi/esp_scsi.h @@ -269,6 +269,7 @@ struct esp_cmd_entry { #define ESP_CMD_FLAG_WRITE 0x01 /* DMA is a write */ #define ESP_CMD_FLAG_ABORT 0x02 /* being aborted */ #define ESP_CMD_FLAG_AUTOSENSE 0x04 /* Doing automatic REQUEST_SENSE */ +#define ESP_CMD_FLAG_RESIDUAL0x08 /* AM53c974 BLAST residual */ u8 tag[2]; u8 orig_tag[2]; -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/10] esp: correctly detect am53c974
On 21/11/2014 10:27, Hannes Reinecke wrote: The am53c974 returns the same ID as the FAS236, but implements things slightly differently. So detect the am53c974 by checking for ESP_CONFIG4 register. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/am53c974.c | 2 ++ drivers/scsi/esp_scsi.c | 17 - drivers/scsi/esp_scsi.h | 15 +++ 3 files changed, 33 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c index 0b7643e..0452ed1 100644 --- a/drivers/scsi/am53c974.c +++ b/drivers/scsi/am53c974.c @@ -365,6 +365,8 @@ static void dc390_check_eeprom(struct esp *esp) } esp-scsi_id = EEbuf[DC390_EE_ADAPT_SCSI_ID]; esp-num_tags = 2 EEbuf[DC390_EE_TAG_CMD_NUM]; + if (EEbuf[DC390_EE_MODE2] DC390_EE_MODE2_ACTIVE_NEGATION) + esp-config4 |= ESP_CONFIG4_RADE | ESP_CONFIG4_RAE; } static int pci_esp_probe_one(struct pci_dev *pdev, diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 88272bb..01753f5 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -250,6 +250,19 @@ static void esp_reset_esp(struct esp *esp) } else { esp-min_period = ((5 * esp-ccycle) / 1000); } + if (esp-rev == FAS236) { + /* + * The AM53c974 chip returns the same ID as FAS236; + * try to configure glitch eater. + */ + u8 config4 = ESP_CONFIG4_GE1; + esp_write8(config4, ESP_CFG4); + config4 = esp_read8(ESP_CFG4); + if ((config4 ESP_CONFIG4_GE1) == ESP_CONFIG4_GE1) { + esp-rev = PCSCSI; + esp_write8(esp-config4, ESP_CFG4); + } + } esp-max_period = (esp-max_period + 3)2; esp-min_period = (esp-min_period + 3)2; @@ -275,7 +288,8 @@ static void esp_reset_esp(struct esp *esp) /* fallthrough... */ case FAS236: - /* Fast 236 or HME */ + case PCSCSI: + /* Fast 236, AM53c974 or HME */ esp_write8(esp-config2, ESP_CFG2); if (esp-rev == FASHME) { u8 cfg3 = esp-target[0].esp_config3; @@ -2397,6 +2411,7 @@ static const char *esp_chip_names[] = { FAS100A, FAST, FASHME, + AM53C974, }; static struct scsi_transport_template *esp_transport_template; diff --git a/drivers/scsi/esp_scsi.h b/drivers/scsi/esp_scsi.h index 5fa456c..84dcbe4 100644 --- a/drivers/scsi/esp_scsi.h +++ b/drivers/scsi/esp_scsi.h @@ -25,6 +25,7 @@ #define ESP_CTEST0x0aUL /* wo Chip test register 0x28 */ #define ESP_CFG2 0x0bUL /* rw Second cfg register 0x2c */ #define ESP_CFG3 0x0cUL /* rw Third cfg register 0x30 */ +#define ESP_CFG4 0x0dUL /* rw Fourth cfg register 0x34 */ #define ESP_TCHI 0x0eUL /* rw High bits transf count 0x38 */ #define ESP_UID ESP_TCHI/* ro Unique ID code 0x38 */ #define FAS_RLO ESP_TCHI/* rw HME extended counter 0x38 */ @@ -76,6 +77,18 @@ #define ESP_CONFIG3_IMS 0x80 /* ID msg chk'ng(esp/fas236) */ #define ESP_CONFIG3_OBPUSH0x80 /* Push odd-byte to dma (hme) */ +/* ESP config register 4 read-write, found only on am53c974 chips */ +#define ESP_CONFIG4_RADE 0x04 /* Active negation */ +#define ESP_CONFIG4_RAE 0x08 /* Active negation on REQ and ACK */ +#define ESP_CONFIG4_PWD 0x20 /* Reduced power feature */ +#define ESP_CONFIG4_GE0 0x40 /* Glitch eater bit 0 */ +#define ESP_CONFIG4_GE1 0x80 /* Glitch eater bit 1 */ + +#define ESP_CONFIG_GE_12NS(0) +#define ESP_CONFIG_GE_25NS(ESP_CONFIG_GE1) +#define ESP_CONFIG_GE_35NS(ESP_CONFIG_GE0) +#define ESP_CONFIG_GE_0NS (ESP_CONFIG_GE0 | ESP_CONFIG_GE1) + /* ESP command register read-write */ /* Group 1 commands: These may be sent at any point in time to the ESP *chip. None of them can generate interrupts 'cept @@ -254,6 +267,7 @@ enum esp_rev { FAS100A= 0x04, FAST = 0x05, FASHME = 0x06, + PCSCSI = 0x07, /* AM53c974 */ }; struct esp_cmd_entry { @@ -466,6 +480,7 @@ struct esp { u8 bursts; u8 config1; u8 config2; + u8 config4; u8 scsi_id; u32 scsi_id_mask; Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/10] scsi: add 'am53c974' driver
On 21/11/2014 10:27, Hannes Reinecke wrote: This patch adds a new implementation for the Tekram DC-390T / AMD AM53c974 SCSI controller, based on the generic esp_scsi infrastructure. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/Kconfig| 18 ++ drivers/scsi/Makefile | 1 + drivers/scsi/am53c974.c | 523 3 files changed, 542 insertions(+) create mode 100644 drivers/scsi/am53c974.c diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 21bc674..497e7d5 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -1357,6 +1357,24 @@ config SCSI_DC390T To compile this driver as a module, choose M here: the module will be called tmscsim. +config SCSI_AM53C974 + tristate Tekram DC390(T) and Am53/79C974 SCSI support (new driver) + depends on PCI SCSI Perhaps make it a choice with SCSI_DC390, since they match on the same PCI vendor/device IDs? Paolo + select SCSI_SPI_ATTRS + ---help--- + This driver supports PCI SCSI host adapters based on the Am53C974A + chip, e.g. Tekram DC390(T), DawiControl 2974 and some onboard + PCscsi/PCnet (Am53/79C974) solutions. + This is a new implementation base on the generic esp_scsi driver. + + Documentation can be found in file:Documentation/scsi/tmscsim.txt. + + Note that this driver does NOT support Tekram DC390W/U/F, which are + based on NCR/Symbios chips. Use NCR53C8XX SCSI support for those. + + To compile this driver as a module, choose M here: the + module will be called am53c974. + config SCSI_T128 tristate Trantor T128/T128F/T228 SCSI support depends on ISA SCSI diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index 76f8932..7f974fc 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -101,6 +101,7 @@ obj-$(CONFIG_SCSI_7000FASST) += wd7000.o obj-$(CONFIG_SCSI_EATA) += eata.o obj-$(CONFIG_SCSI_DC395x)+= dc395x.o obj-$(CONFIG_SCSI_DC390T)+= tmscsim.o +obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o obj-$(CONFIG_MEGARAID_LEGACY)+= megaraid.o obj-$(CONFIG_MEGARAID_NEWGEN)+= megaraid/ obj-$(CONFIG_MEGARAID_SAS) += megaraid/ diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c new file mode 100644 index 000..b9af8b0 --- /dev/null +++ b/drivers/scsi/am53c974.c @@ -0,0 +1,523 @@ +/* + * AMD am53c974 driver. + * Copyright (c) 2014 Hannes Reinecke, SUSE Linux GmbH + */ + +#include linux/kernel.h +#include linux/module.h +#include linux/init.h +#include linux/delay.h +#include linux/pci.h +#include linux/interrupt.h + +#include scsi/scsi_host.h + +#include esp_scsi.h + +#define DRV_MODULE_NAME am53c974 +#define DRV_MODULE_VERSION 1.00 + +// #define ESP_DMA_DEBUG + +#define ESP_DMA_CMD 0x10 +#define ESP_DMA_STC 0x11 +#define ESP_DMA_SPA 0x12 +#define ESP_DMA_WBC 0x13 +#define ESP_DMA_WAC 0x14 +#define ESP_DMA_STATUS 0x15 +#define ESP_DMA_SMDLA 0x16 +#define ESP_DMA_WMAC 0x17 + +#define ESP_DMA_CMD_IDLE 0x00 +#define ESP_DMA_CMD_BLAST 0x01 +#define ESP_DMA_CMD_ABORT 0x02 +#define ESP_DMA_CMD_START 0x03 +#define ESP_DMA_CMD_MASK 0x03 +#define ESP_DMA_CMD_DIAG 0x04 +#define ESP_DMA_CMD_MDL 0x10 +#define ESP_DMA_CMD_INTE_P 0x20 +#define ESP_DMA_CMD_INTE_D 0x40 +#define ESP_DMA_CMD_DIR 0x80 + +#define ESP_DMA_STAT_PWDN 0x01 +#define ESP_DMA_STAT_ERROR 0x02 +#define ESP_DMA_STAT_ABORT 0x04 +#define ESP_DMA_STAT_DONE 0x08 +#define ESP_DMA_STAT_SCSIINT 0x10 +#define ESP_DMA_STAT_BCMPLT 0x20 + +/* EEPROM is accessed with 16-bit values */ +#define DC390_EEPROM_READ 0x80 +#define DC390_EEPROM_LEN 0x40 + +/* + * DC390 EEPROM + * + * 8 * 4 bytes of per-device options + * followed by HBA specific options + */ + +/* Per-device options */ +#define DC390_EE_MODE1 0x00 +#define DC390_EE_SPEED 0x01 + +/* HBA-specific options */ +#define DC390_EE_ADAPT_SCSI_ID 0x40 +#define DC390_EE_MODE2 0x41 +#define DC390_EE_DELAY 0x42 +#define DC390_EE_TAG_CMD_NUM 0x43 + +#define DC390_EE_MODE1_PARITY_CHK 0x01 +#define DC390_EE_MODE1_SYNC_NEGO0x02 +#define DC390_EE_MODE1_EN_DISC 0x04 +#define DC390_EE_MODE1_SEND_START 0x08 +#define DC390_EE_MODE1_TCQ 0x10 + +#define DC390_EE_MODE2_MORE_2DRV0x01 +#define DC390_EE_MODE2_GREATER_1G 0x02 +#define DC390_EE_MODE2_RST_SCSI_BUS 0x04 +#define DC390_EE_MODE2_ACTIVE_NEGATION 0x08 +#define DC390_EE_MODE2_NO_SEEK 0x10 +#define DC390_EE_MODE2_LUN_CHECK0x20 + +struct pci_esp_priv { + struct esp *esp; + u8 dma_status; +}; + +#define PCI_ESP_GET_PRIV(esp) ((struct pci_esp_priv *) \ +pci_get_drvdata((struct pci_dev *) \ +(esp)-dev)) + +static void pci_esp_dma_drain(struct esp *esp); + +static void pci_esp_write8(struct esp *esp, u8
Re: [PATCH 06/10] scsi: add 'am53c974' driver
Oops, hit send too early. A small nit: On 21/11/2014 10:27, Hannes Reinecke wrote: +static void pci_esp_dma_drain(struct esp *esp) +{ + u8 resid; + int lim = 1000; + + + if ((esp-sreg ESP_STAT_PMASK) == ESP_DOP || + (esp-sreg ESP_STAT_PMASK) == ESP_DIP) + /* Data-In or Data-Out, nothing to be done */ + return; + + while (--lim 0) { + resid = pci_esp_read8(esp, ESP_FFLAGS) ESP_FF_FBYTES; + if (resid = 1) + break; cpu_relax()? + } + if (resid 1) { + /* FIFO not cleared */ + shost_printk(KERN_INFO, esp-host, + FIFO not cleared, %d bytes left\n, + resid); + } + + /* + * When there is a residual BCMPLT will never be set + * (obviously). But we still have to issue the BLAST + * command, otherwise the data will not being transferred. + * But we'll never know when the BLAST operation is + * finished. So check for some time and give up eventually. + */ + lim = 1000; + pci_esp_write8(esp, ESP_DMA_CMD_DIR | ESP_DMA_CMD_BLAST, ESP_DMA_CMD); + while (pci_esp_read8(esp, ESP_DMA_STATUS) ESP_DMA_STAT_BCMPLT) { + if (--lim == 0) + break; cpu_relax()? Otherwise looks sane. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/10] esp: Use FIFO for command submission
On 21/11/2014 11:38, Hannes Reinecke wrote: esp-msg_out_len = 0; *p++ = IDENTIFY(0, lun); @@ -648,12 +651,21 @@ static void esp_autosense(struct esp *esp, struct esp_cmd_entry *ent) esp_write_tgt_sync(esp, tgt); esp_write_tgt_config3(esp, tgt); - val = (p - esp-command_block); + if (esp-flags ESP_FLAG_USE_FIFO) + val = (p - esp-fifo); + else + val = (p - esp-command_block); if (esp-rev == FASHME) scsi_esp_cmd(esp, ESP_CMD_FLUSH); For consistency with what you do elsewhere, please move this in the else. No. The 'USE_FIFO' mechanism is a general feature which _should_ work on all chips. The above 'if' condition is a chipset-specific feature which should be treated separately. Yup, but USE_FIFO always sends a flush anyway. Sending two is useless. - esp-ops-send_dma_cmd(esp, esp-command_block_dma, - val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA); + if (esp-flags ESP_FLAG_USE_FIFO) { + scsi_esp_cmd(esp, ESP_CMD_FLUSH); + for (i = 0; i val; i++) + esp_write8(esp-fifo[i], ESP_FDATA); + scsi_esp_cmd(esp, ESP_CMD_SELA); + } else + esp-ops-send_dma_cmd(esp, esp-command_block_dma, + val, 16, 0, ESP_CMD_DMA | ESP_CMD_SELA); } Hmm, what about a wrapper __send_cmd(esp, data, esp_count, dma_count, cmd, force_flush, force_fifo) { use_fifo = force_fifo || (esp-flags ESP_FLAG_USE_FIFO) || esp_count == 1; if (force_flush || use_fifo || esp-rev == FASHME) scsi_esp_cmd(esp, ESP_CMD_FLUSH); if (use_fifo) { for (i = 0; i val; i++) esp_write8(esp-fifo[i], ESP_FDATA); scsi_esp_cmd(esp, cmd); } else { if (data != esp-command_block) memcpy(esp-command_block, data, length) esp-ops-send_dma_cmd(esp, esp-command_block_dma, esp_count, dma_count, 0, cmd | ESP_CMD_DMA); } } send_cmd(esp, data, esp_count, dma_count, cmd) { __send_cmd(esp, data, esp_count, dma_count, cmd, false, false); } This would be: send_cmd(esp, esp-command_block, val, 16, ESP_CMD_SELA); Good point. (Note this was also why I was suggesting just using esp-command_block unconditionally). Paolo Will be updating the patch. Cheers, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] esp: enable CONFIG2_FENAB for am53c974
On 21/11/2014 11:22, Hannes Reinecke wrote: On 11/21/2014 11:08 AM, Paolo Bonzini wrote: On 21/11/2014 10:27, Hannes Reinecke wrote: CONFIG2_FENAB ('feature enable') changed definition between chip revisions, from 'Latch SCSI Phase' to 'Latch SCSI Phase, display chip ID upon reset, and enable 24 bit addresses'. So only enable it for am53c974 where we know what it's doing. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/am53c974.c | 30 ++ drivers/scsi/esp_scsi.c | 4 2 files changed, 34 insertions(+) diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c index 0452ed1..722e781 100644 --- a/drivers/scsi/am53c974.c +++ b/drivers/scsi/am53c974.c @@ -252,6 +252,8 @@ static void pci_esp_send_dma_cmd(struct esp *esp, u32 addr, u32 esp_count, pci_esp_write8(esp, (esp_count 0) 0xff, ESP_TCLOW); pci_esp_write8(esp, (esp_count 8) 0xff, ESP_TCMED); + if (esp-config2 ESP_CONFIG2_FENAB) + pci_esp_write8(esp, (esp_count 16) 0xff, ESP_TCHI); Why do this conditionally? We know that FENAB is true here, don't we? (Maybe I'm missing something obvious though). Not really. Point is that 'FENAB' does actually three things: - Enable TCHI for 24-bit DMA transfer lengths - Provide Chip ID in TCHI after reset - Latch SCSI phase after completion in SCSI STATUS So we _might_ run into timing issues due to the last point, so I've made it conditional in case we'd have to disable it. Maybe make it a module parameter maybe? Something like this lets you set esp-config2 from the driver. Look at it with -b to avoid insanity, it changes half a dozen line modulo the reindendation. -- 8 From: Paolo Bonzini pbonz...@redhat.com Subject: [PATCH] esp_scsi: let DMA driver provide a config2 value On PCscsi, the FENAB configuration also enables 24-bit DMA transfer lengths (and provides the chip id in TCHI after reset). We want this to be available as a module parameter. Check if the caller of scsi_esp_register provided a value for esp-config2. If this is the case, assume this is not an ESP100, skip the detection phase and leave esp-config2 untouched. It will be used in esp_reset_esp. Signed-off-by: Paolo Bonzini pbonz...@redhat.com diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index 55548dc5cec3..123edcf439c0 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -2149,47 +2149,50 @@ static void esp_get_revision(struct esp *esp) u8 val; esp-config1 = (ESP_CONFIG1_PENABLE | (esp-scsi_id 7)); - esp-config2 = (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY); + if (esp-config2 == 0) { + esp-config2 = (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY); + esp_write8(esp-config2, ESP_CFG2); + + val = esp_read8(ESP_CFG2); + val = ~ESP_CONFIG2_MAGIC; + + esp-config2 = 0; + if (val != (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY)) { + /* If what we write to cfg2 does not come back, cfg2 is not +* implemented, therefore this must be a plain esp100. +*/ + esp-rev = ESP100; + return; + } + } + + esp_set_all_config3(esp, 5); + esp-prev_cfg3 = 5; esp_write8(esp-config2, ESP_CFG2); + esp_write8(0, ESP_CFG3); + esp_write8(esp-prev_cfg3, ESP_CFG3); - val = esp_read8(ESP_CFG2); - val = ~ESP_CONFIG2_MAGIC; - if (val != (ESP_CONFIG2_SCSI2ENAB | ESP_CONFIG2_REGPARITY)) { - /* If what we write to cfg2 does not come back, cfg2 is not -* implemented, therefore this must be a plain esp100. + val = esp_read8(ESP_CFG3); + if (val != 5) { + /* The cfg2 register is implemented, however +* cfg3 is not, must be esp100a. */ - esp-rev = ESP100; + esp-rev = ESP100A; } else { - esp-config2 = 0; - esp_set_all_config3(esp, 5); - esp-prev_cfg3 = 5; - esp_write8(esp-config2, ESP_CFG2); - esp_write8(0, ESP_CFG3); + esp_set_all_config3(esp, 0); + esp-prev_cfg3 = 0; esp_write8(esp-prev_cfg3, ESP_CFG3); - val = esp_read8(ESP_CFG3); - if (val != 5) { - /* The cfg2 register is implemented, however -* cfg3 is not, must be esp100a. -*/ - esp-rev = ESP100A; + /* All of cfg{1,2,3} implemented, must be one of +* the fas variants, figure out which one. +*/ + if (esp-cfact == 0 || esp-cfact ESP_CCF_F5) { + esp-rev = FAST; + esp-sync_defp = SYNC_DEFP_FAST
Re: [PATCH 06/12] esp_scsi: use FIFO for command submission
ESP_FLAG_QUICKIRQ_CHECK 0x0010 #define ESP_FLAG_DISABLE_SYNC0x0020 +#define ESP_FLAG_USE_FIFO0x0040 u8 select_state; #define ESP_SELECT_NONE 0x00 /* Not selecting */ Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1] virtio_scsi: support multi hw queue of blk-mq
On 15/11/2014 04:47, Ming Lei wrote: Since virtio_scsi has supported multi virtqueue already, it is natural to map the virtque to hw-queue of blk-mq. Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Ming Lei ming@canonical.com --- V1: - support non-mq too drivers/scsi/virtio_scsi.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index b83846f..d3af70e 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -561,6 +561,15 @@ static int virtscsi_queuecommand_single(struct Scsi_Host *sh, return virtscsi_queuecommand(vscsi, vscsi-req_vqs[0], sc); } +static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi, + struct scsi_cmnd *sc) +{ + u32 tag = blk_mq_unique_tag(sc-request); + u16 hwq = blk_mq_unique_tag_to_hwq(tag); + + return vscsi-req_vqs[hwq]; +} + static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi, struct virtio_scsi_target_state *tgt) { @@ -604,7 +613,12 @@ static int virtscsi_queuecommand_multi(struct Scsi_Host *sh, struct virtio_scsi *vscsi = shost_priv(sh); struct virtio_scsi_target_state *tgt = scsi_target(sc-device)-hostdata; - struct virtio_scsi_vq *req_vq = virtscsi_pick_vq(vscsi, tgt); + struct virtio_scsi_vq *req_vq; + + if (shost_use_blk_mq(sh)) + req_vq = virtscsi_pick_vq_mq(vscsi, sc); + else + req_vq = virtscsi_pick_vq(vscsi, tgt); return virtscsi_queuecommand(vscsi, req_vq, sc); } @@ -983,6 +997,7 @@ static int virtscsi_probe(struct virtio_device *vdev) shost-max_id = num_targets; shost-max_channel = 0; shost-max_cmd_len = VIRTIO_SCSI_CDB_SIZE; + shost-nr_hw_queues = num_queues; if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) { host_prot = SHOST_DIF_TYPE1_PROTECTION | SHOST_DIF_TYPE2_PROTECTION | Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] virtio_scsi: support multi hw queue of blk-mq
On 10/11/2014 17:05, Christoph Hellwig wrote: a) there is no multiath support for it, and we simply can't break existing setups that use. b) there is no support for I/O schedulers at all. This might be okay for virtio-scsi, where in general you have a host scheduler, but for real hardware that still has to deal with more spinning rust than solid state it's a blocker. Fortuntely Jens is working on that one. It is sort-of okay for virtio-scsi (and virtio-blk). Deadline provides a good speedup over noop for virtual disks, even in the presence of a host scheduler---though nobody ever benchmarked why. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] virtio_scsi: support multi hw queue of blk-mq
On 09/11/2014 17:57, Ming Lei wrote: Since virtio_scsi has supported multi virtqueue already, it is natural to map virtque to hw-queue of blk-mq. Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Ming Lei ming@canonical.com --- drivers/scsi/virtio_scsi.c | 154 1 file changed, 14 insertions(+), 140 deletions(-) Nice. :) FWIW, instead of force_blk_mq, it would be fine for me to make virtio-scsi use a single queue if not using blk-mq. Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] sd: Disable discard_zeroes_data for UNMAP
On 07/11/2014 06:08, Martin K. Petersen wrote: The T10 SBC UNMAP command does not provide any hard guarantees that blocks will return zeroes on a subsequent READ. This is due to the fact that the device server is free to silently ignore all or parts of the request. The only way to ensure that a block consistently returns zeroes after being unmapped is to use WRITE SAME with the UNMAP bit set. Should the device be unable to unmap one or more blocks described by the command it is required to manually write zeroes to them. Until now we have preferred UNMAP over the WRITE SAME variants to accommodate thinly provisioned devices that predated the final SBC-3 spec. This patch changes the heuristic so that we favor WRITE SAME(16) or (10) over UNMAP if these commands are marked as supported in the Logical Block Provisioning VPD page. The patch also disables discard_zeroes_data for devices operating in UNMAP mode. Signed-off-by: Martin K. Petersen martin.peter...@oracle.com --- drivers/scsi/sd.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index b041eca8955d..95bfb7bfbb9d 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -656,7 +656,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) unsigned int logical_block_size = sdkp-device-sector_size; unsigned int max_blocks = 0; - q-limits.discard_zeroes_data = sdkp-lbprz; + q-limits.discard_zeroes_data = 0; q-limits.discard_alignment = sdkp-unmap_alignment * logical_block_size; q-limits.discard_granularity = @@ -680,11 +680,13 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) case SD_LBP_WS16: max_blocks = min_not_zero(sdkp-max_ws_blocks, (u32)SD_MAX_WS16_BLOCKS); + q-limits.discard_zeroes_data = sdkp-lbprz; break; case SD_LBP_WS10: max_blocks = min_not_zero(sdkp-max_ws_blocks, (u32)SD_MAX_WS10_BLOCKS); + q-limits.discard_zeroes_data = sdkp-lbprz; break; case SD_LBP_ZERO: @@ -2622,12 +2624,12 @@ static void sd_read_block_limits(struct scsi_disk *sdkp) } else {/* LBP VPD page tells us what to use */ - if (sdkp-lbpu sdkp-max_unmap_blocks) - sd_config_discard(sdkp, SD_LBP_UNMAP); - else if (sdkp-lbpws) + if (sdkp-lbpws) sd_config_discard(sdkp, SD_LBP_WS16); else if (sdkp-lbpws10) sd_config_discard(sdkp, SD_LBP_WS10); + else if (sdkp-lbpu sdkp-max_unmap_blocks) + sd_config_discard(sdkp, SD_LBP_UNMAP); else sd_config_discard(sdkp, SD_LBP_DISABLE); } Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
Il 09/10/2014 06:14, Nicholas A. Bellinger ha scritto: AFAICT from qemu code, the ioctl VHOST_SCSI_CLEAR_ENDPOINT is always called during shutdown in order to release the endpoint and drop this new configfs dependency. As far as I can see, the only path leading to the ioctl is vhost_scsi_set_status-vhost_scsi_stop. That only happens if the guest driver resets the device upon shutdown, or via vhost_scsi_unrealize as you pointed out. But unrealize() is only called when a device is hot-unplugged. It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the quit command, because no attempt is done to bring down the VM data structures (or free memory, or close file descriptors) in case of a fatal exit. The kernel should do that for us. Besides that... The question is, what happens when qemu crashes..? Is there currently an assurance that VHOST_SCSI_CLEAR_ENDPOINT is called via the normal VirtioDeviceClass-unrealize() when qemu exits abnormally..? ... of course nothing is called if you SIGKILL QEMU. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
Il 09/10/2014 10:49, Paolo Bonzini ha scritto: It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the quit command, because no attempt is done to bring down the VM data structures (or free memory, or close file descriptors) in case of a fatal exit. The kernel should do that for us. ... and in the case of vhost-scsi, doesn't it do that when vhost_scsi_release calls vhost_scsi_clear_endpoint? Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 29/08/2014 08:08, Hannes Reinecke ha scritto: No. FAILED for any eh_abort_cmd() means that the TMF hasn't been sent. So the midlayer escalates to the next EH step. The command will only ever be re-issued once EH completes. Then the answer to Hans's question is yes. It is legal to call -scsi_done() after the eh_abort handler returns FAILED. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 28/08/2014 14:26, Hans de Goede ha scritto: Then, blk_complete_request will do nothing because its call to blk_mark_rq_complete returns true. All this, of course, as long as -scsi_done is called _before_ eh_abort returns. What about calling scsi_done after eh_abort if eh_abort returned FAILED? I invoke the fifth amendment. :) Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 28/08/2014 16:17, Hannes Reinecke ha scritto: As mentioned earlier, as soon as SCSI EH is invoked control is assumed to be transferred back to the SCSI midlayer. How the midlayer interprets any return value from the various eh_XX callbacks is immaterial to the LLDD. So even if the eh_abort returns FAILED control is still with the SCSI midlayer, so the earlier statements apply. IE the command will be short-circuited by the block layer anyway if -scsi_done() is called. As I parsed it, the question is not whether the short-circuiting will happen. It's whether you will have use-after-free bugs or not if you call -scsi_done() after eh_abort returns FAILED. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 28/08/2014 17:50, Elliott, Robert (Server Storage) ha scritto: Is the block layer prevented from issuing a new command with the same tag before the error handling is finished? Tags are chosen by the LLDs, so it's up to it to pick the right tags. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 23/08/2014 16:52, Hans de Goede ha scritto: Hi All, Now that the UAS driver is no longer marked as CONFIG_BROKEN, I'm getting quite a few bug reports about issues with UAS drives. One if the issues is that there might be a number of bugs in the abort handling path, as I don't think that was ever tested properly. So I'm wondering is there a way to test the abort path with a real drive? E.G. submit some command which is known to take a significant amount of time, and then abort it right after submitting ? You could have some luck with QEMU's UAS emulation. If you set QEMU's I/O throttling options low enough (e.g. 100KB/sec), and then use dd with iflag=direct and a largish block size (a few MBs), the guest should abort its I/O soon enough. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 25/08/2014 12:28, Bart Van Assche ha scritto: From SPC-4: 7.5.8 Control mode page [ ... ] A task aborted status (TAS) bit set to zero specifies that aborted commands shall be terminated by the device server without any response to the application client. A TAS bit set to one specifies that commands aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received shall be completed with TASK ABORTED status (see SAM-5). Note the aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received. In practice, this means that TASK ABORTED should only happen if you use the CLEAR TASK SET tmf and TST is not set to 001b (i.e. _not_ to per I_T nexus) in the Control mode page. It should never happen for a pen drive. Setting TASK ABORTED aside, the important part is that an abort can do one of two things: - complete the command, and then eh_abort should return after the driver has noticed the completion and called the -scsi_done callback for the Scsi_Cmnd*. - abort the command, and then the driver should never call the -scsi_done callback for the Scsi_Cmnd*. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] virtio_scsi: check on resp-sense_len instead of 'sense_buffer'
Il 18/07/2014 16:57, Ming Lei ha scritto: - if (sc-sense_buffer) { + if (resp-sense_len) { In the (unlikely) case that sc-sense_buffer == NULL, you'd pass a NULL to memcpy. If you want, you can change this if to if (sc-sense_buffer resp-sense_len) but frankly it seems like slightly pointless churn to me. Paolo memcpy(sc-sense_buffer, resp-sense, min_t(u32, resp-sense_len, VIRTIO_SCSI_SENSE_SIZE)); - if (resp-sense_len) - set_driver_byte(sc, DRIVER_SENSE); + set_driver_byte(sc, DRIVER_SENSE); } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] virtio-scsi queue for 3.17
Christoph asked me to rebase the two virtio-scsi patches from http://thread.gmane.org/gmane.linux.kernel/1717796 that do not apply anymore, patch 2 and 6. These are on top of his drivers-for-3.16 branch, more precisely commit 8faeb529b2da (virtio-scsi: fix various bad behavior on aborted requests). Paolo Ming Lei (1): virtio-scsi: replace target spinlock with seqcount Venkatesh Srinivas (1): virtio-scsi: Implement change_queue_depth for virtscsi targets drivers/scsi/virtio_scsi.c | 75 ++ 1 file changed, 62 insertions(+), 13 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] virtio-scsi: replace target spinlock with seqcount
From: Ming Lei ming@canonical.com The spinlock of tgt_lock is only for serializing read and write req_vq, one lockless seqcount is enough for the purpose. On one 16core VM with vhost-scsi backend, the patch can improve IOPS with 3% on random read test. Signed-off-by: Ming Lei ming@canonical.com [Add initialization in virtscsi_target_alloc. - Paolo] Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- drivers/scsi/virtio_scsi.c | 42 +- 1 file changed, 29 insertions(+), 13 deletions(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 308256b5e4cb..f67a872de14b 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -27,6 +27,7 @@ #include scsi/scsi_host.h #include scsi/scsi_device.h #include scsi/scsi_cmnd.h +#include linux/seqlock.h #define VIRTIO_SCSI_MEMPOOL_SZ 64 #define VIRTIO_SCSI_EVENT_LEN 8 @@ -75,18 +76,16 @@ struct virtio_scsi_vq { * queue, and also lets the driver optimize the IRQ affinity for the virtqueues * (each virtqueue's affinity is set to the CPU that owns the queue). * - * tgt_lock is held to serialize reading and writing req_vq. Reading req_vq - * could be done locklessly, but we do not do it yet. + * tgt_seq is held to serialize reading and writing req_vq. * * Decrements of reqs are never concurrent with writes of req_vq: before the * decrement reqs will be != 0; after the decrement the virtqueue completion * routine will not use the req_vq so it can be changed by a new request. - * Thus they can happen outside the tgt_lock, provided of course we make reqs + * Thus they can happen outside the tgt_seq, provided of course we make reqs * an atomic_t. */ struct virtio_scsi_target_state { - /* This spinlock never held at the same time as vq_lock. */ - spinlock_t tgt_lock; + seqcount_t tgt_seq; /* Count of outstanding requests. */ atomic_t reqs; @@ -559,19 +558,33 @@ static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi, unsigned long flags; u32 queue_num; - spin_lock_irqsave(tgt-tgt_lock, flags); + local_irq_save(flags); + if (atomic_inc_return(tgt-reqs) 1) { + unsigned long seq; + + do { + seq = read_seqcount_begin(tgt-tgt_seq); + vq = tgt-req_vq; + } while (read_seqcount_retry(tgt-tgt_seq, seq)); + } else { + /* no writes can be concurrent because of atomic_t */ + write_seqcount_begin(tgt-tgt_seq); + + /* keep previous req_vq if a reader just arrived */ + if (unlikely(atomic_read(tgt-reqs) 1)) { + vq = tgt-req_vq; + goto unlock; + } - if (atomic_inc_return(tgt-reqs) 1) - vq = tgt-req_vq; - else { queue_num = smp_processor_id(); while (unlikely(queue_num = vscsi-num_queues)) queue_num -= vscsi-num_queues; - tgt-req_vq = vq = vscsi-req_vqs[queue_num]; + unlock: + write_seqcount_end(tgt-tgt_seq); } + local_irq_restore(flags); - spin_unlock_irqrestore(tgt-tgt_lock, flags); return vq; } @@ -667,14 +680,17 @@ static int virtscsi_abort(struct scsi_cmnd *sc) static int virtscsi_target_alloc(struct scsi_target *starget) { + struct Scsi_Host *sh = dev_to_shost(starget-dev.parent); + struct virtio_scsi *vscsi = shost_priv(sh); + struct virtio_scsi_target_state *tgt = kmalloc(sizeof(*tgt), GFP_KERNEL); if (!tgt) return -ENOMEM; - spin_lock_init(tgt-tgt_lock); + seqcount_init(tgt-tgt_seq); atomic_set(tgt-reqs, 0); - tgt-req_vq = NULL; + tgt-req_vq = vscsi-req_vqs[0]; starget-hostdata = tgt; return 0; -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/5] scsi: remove various exports that were only used by scsi_tgt
*); extern void scsi_finish_command(struct scsi_cmnd *cmd); extern void *scsi_kmap_atomic_sg(struct scatterlist *sg, int sg_count, size_t *offset, size_t *len); extern void scsi_kunmap_atomic_sg(void *virt); -extern int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask); -extern void scsi_release_buffers(struct scsi_cmnd *cmd); - extern int scsi_dma_map(struct scsi_cmnd *cmd); extern void scsi_dma_unmap(struct scsi_cmnd *cmd); diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 28336c6..4c38c16 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -795,8 +795,6 @@ extern struct Scsi_Host *scsi_host_lookup(unsigned short); extern const char *scsi_host_state_name(enum scsi_host_state); extern void scsi_cmd_get_serial(struct Scsi_Host *, struct scsi_cmnd *); -extern u64 scsi_calculate_bounce_limit(struct Scsi_Host *); - static inline int __must_check scsi_add_host(struct Scsi_Host *host, struct device *dev) { Reviewed-by: Paolo Bonzini pbonz...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tgt infrastructure removal
Il 04/07/2014 11:36, Christoph Hellwig ha scritto: I got an older Ack from Tomo for the removal, and some sort of acks for the ibmvscsi target removal from Brian, Paolo and Nathan. Can I get some formal reviews for the code removal patches (1-4) so I can queue this up for 3.17? Not that there's much to review there. :) Reviewed-by: Paolo Bonzini pbonz...@redhat.com Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] ibmvstgt: remove
Il 24/06/2014 16:28, Christoph Hellwig ha scritto: Adding Paul and Nathan to cc here. I'm pretty sure the backend for ibmvscsi in KVM was all done in qemu and there is no dependency on ibmvstgt. FWIW the ibmvscsi backend is indeed entirely in userspace (hw/scsi/spapr_vscsi.c in the QEMU tree). Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html