[PATCH v4 13/15] block: add bsg_job_put() and bsg_job_get()

2016-11-17 Thread Johannes Thumshirn
Add bsg_job_put() and bsg_job_get() so we don't need to export bsg_destroy_job() any more. Signed-off-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke --- block/bsg-lib.c | 17 ++--- drivers/scsi/scsi_transport_fc.c | 4 ++--

[PATCH 09/11] virtio: provide a method to get the IRQ affinity mask for a virtqueue

2016-11-17 Thread Christoph Hellwig
This basically passes up the pci_irq_get_affinity information through virtio through an optional get_vq_affinity method. It is only implemented by the PCI backend for now, and only when we use per-virtqueue IRQs. Signed-off-by: Christoph Hellwig ---

[PATCH 11/11] virtio_blk: use virtio IRQ affinity

2016-11-17 Thread Christoph Hellwig
Use automatic IRQ affinity assignment in the virtio layer if available, and build the blk-mq queues based on it. Signed-off-by: Christoph Hellwig --- drivers/block/virtio_blk.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git

[PATCH v4 11/15] scsi: fc: use bsg_softirq_done

2016-11-17 Thread Johannes Thumshirn
bsg_softirq_done() and fc_bsg_softirq_done() are copies of each other, so ditch the fc specific one. Signed-off-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke --- block/bsg-lib.c | 3 ++- drivers/scsi/scsi_transport_fc.c | 15

[PATCH v4 15/15] block: unexport bsg_softirq_done() again

2016-11-17 Thread Johannes Thumshirn
Unexport bsg_softirq_done() again, we don't need it outside of bsg-lib.c anymore now that scsi_transport_fc is a pure bsg-lib client. Signed-off-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke --- block/bsg-lib.c | 3 +-- include/linux/bsg-lib.h |

[PATCH 02/11] virtio_pci: remove the call to vp_free_vectors in vp_request_msix_vectors

2016-11-17 Thread Christoph Hellwig
vp_request_msix_vectors is only called by vp_try_to_find_vqs, which already calls vp_free_vectors through vp_del_vqs in the failure case. Signed-off-by: Christoph Hellwig --- drivers/virtio/virtio_pci_common.c | 1 - 1 file changed, 1 deletion(-) diff --git

[PATCH 05/11] virtio_pci: use shared interrupts for virtqueues

2016-11-17 Thread Christoph Hellwig
This lets the IRQ layer handle dispatching IRQs to separate handlers for the case where we don't have per-VQ MSI-X vectors, and allows us to greatly simplify the code based on the assumption that we always have interrupt vector 0 (legacy INTx or config interrupt for MSI-X) available, and any other

automatic IRQ affinity for virtio

2016-11-17 Thread Christoph Hellwig
Hi Michael, this series contains a couple cleanups for the virtio_pci interrupt handling code, including a switch to the new pci_irq_alloc_vectors helper, and support for automatic affinity by the PCI layer if the consumers ask for it. It then converts over virtio_blk to use this functionality

[PATCH 08/11] virtio: allow drivers to request IRQ affinity when creating VQs

2016-11-17 Thread Christoph Hellwig
Add a struct irq_affinity pointer to the find_vqs methods, which if set is used to tell the PCI layer to create the MSI-X vectors for our I/O virtqueues with the proper affinity from the start. Compared to after the fact affinity hints this gives us an instantly working setup and allows to

[PATCH 03/11] virtio_pci: merge vp_free_vectors into vp_del_vqs

2016-11-17 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig --- drivers/virtio/virtio_pci_common.c | 61 +- 1 file changed, 27 insertions(+), 34 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 93700c5..f6c5499 100644

[PATCH v4 08/15] block: add reference counting for struct bsg_job

2016-11-17 Thread Johannes Thumshirn
Add reference counting to 'struct bsg_job' so we can implement a request timeout handler for bsg_jobs, which is needed for Fibre Channel. Signed-off-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke --- block/bsg-lib.c | 7 +--
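
The get/put pattern this patch describes can be illustrated with a minimal userspace sketch. It is not the bsg-lib code: `struct job`, `job_alloc()`, `job_get()` and `job_put()` are hypothetical names, and the counter is a plain integer, whereas in-kernel code would need an atomic counter such as a kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a job object; only the counter matters here. */
struct job {
    unsigned int refs;  /* number of holders; NOT atomic in this sketch */
    int *released;      /* set to 1 when the job is freed (for demonstration) */
};

/* Allocate a job with one reference held by the caller. */
struct job *job_alloc(int *released)
{
    struct job *j = malloc(sizeof(*j));

    if (!j)
        return NULL;
    j->refs = 1;
    j->released = released;
    return j;
}

/* Take an extra reference, e.g. before a timeout handler may touch the job. */
void job_get(struct job *j)
{
    assert(j->refs > 0);
    j->refs++;
}

/* Drop a reference; returns 1 if this was the last one and the job was freed. */
int job_put(struct job *j)
{
    assert(j->refs > 0);
    if (--j->refs > 0)
        return 0;
    *j->released = 1;
    free(j);
    return 1;
}
```

With this shape, a completion path and a timeout path can each hold their own reference, and whichever drops the last one frees the job.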

[PATCH v4 10/15] scsi: fc: Use bsg_destroy_job

2016-11-17 Thread Johannes Thumshirn
fc_destroy_bsgjob() and bsg_destroy_job() are now 1:1 copies, so use the latter. As bsg_destroy_job() comes from bsg-lib we need to select it in Kconfig once CONFIG_SCSI_FC_ATTRS is active. Signed-off-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke ---

Re: [PATCH 5/5] nvmet: add support for the Write Zeroes command

2016-11-17 Thread Christoph Hellwig
On Wed, Nov 16, 2016 at 06:47:22PM +0200, Sagi Grimberg wrote: > > + if (__blkdev_issue_zeroout(req->ns->bdev, sector, nr_sector, > > + GFP_KERNEL, &bio, false)) > > + status = NVME_SC_INTERNAL | NVME_SC_DNR; > > + > > + if (bio) { > > +

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Jens Axboe
On 11/17/2016 05:50 AM, Christoph Hellwig wrote: On Wed, Nov 16, 2016 at 09:43:57PM -0700, Jens Axboe wrote: OK, I'm getting reasonably happy with it now: http://git.kernel.dk/cgit/linux-block/log/?h=for-4.10/hch-dio Sharing the FUA setting to avoid the manual flush between the two, and

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Jens Axboe
On 11/17/2016 05:46 AM, Christoph Hellwig wrote: On Wed, Nov 16, 2016 at 01:02:05PM -0700, Jens Axboe wrote: I'm fine with this approach, but I would still REALLY like to see blkdev_bio_end_io() split in two, once for sync and once for async. That would be a lot cleaner, imho. Do you still

Re: [PATCH v1 0/7] SED OPAL Library

2016-11-17 Thread Scott Bauer
On Thu, Nov 17, 2016 at 05:12:51AM -0800, Christoph Hellwig wrote: > Hi Scott, > > I took a look at the code and here are some very high level comments: > > - we only call into block_device_operations.sec_ops from the ioctl >handlers. So instead of adding it to the block layer I'd rather >

Re: [PATCH v1 1/7] Include: Add definitions for sed

2016-11-17 Thread Scott Bauer
On Thu, Nov 17, 2016 at 07:22:15AM -0800, Christoph Hellwig wrote: > > @@ -0,0 +1,58 @@ > > +/* > > + * Copyright © 2016 Intel Corporation > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the

[PATCH][V4] nbd: add multi-connection support

2016-11-17 Thread Josef Bacik
NBD can become contended on its single connection. We have to serialize all writes and we can only process one read response at a time. Fix this by allowing userspace to provide multiple connections to a single nbd device. This coupled with block-mq drastically increases performance in

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Jens Axboe
On 11/17/2016 01:17 PM, Christoph Hellwig wrote: Ok, the updated tree is here: http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/new-dio This moves your O_SYNC patch earlier and makes it not directly assign bi_opf. Then comes the main direct I/O path with the various folds, and

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Christoph Hellwig
On Thu, Nov 17, 2016 at 01:30:12PM -0700, Jens Axboe wrote: > The last patch seems to have some issues: > > + bool is_sync = is_sync = is_sync_kiocb(iocb); > > and it's reverting part of the dio->is_sync that was part of my union > waiter/iocb change. So I'll leave that out for now, for

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Jens Axboe
On 11/17/2016 02:04 PM, Christoph Hellwig wrote: On Thu, Nov 17, 2016 at 01:30:12PM -0700, Jens Axboe wrote: The last patch seems to have some issues: + bool is_sync = is_sync = is_sync_kiocb(iocb); and it's reverting part of the dio->is_sync that was part of my union waiter/iocb

[PATCH v4] mm: don't cap request size based on read-ahead setting

2016-11-17 Thread Jens Axboe
We ran into a funky issue, where someone doing 256K buffered reads saw 128K requests at the device level. Turns out it is read-ahead capping the request size, since we use 128K as the default setting. This doesn't make a lot of sense - if someone is issuing 256K reads, they should see 256K reads,
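
The arithmetic behind the report can be sketched as follows. The first function models the behaviour described above (the read-ahead window caps the I/O size). The second is only one possible alternative policy, assumed here for illustration since the snippet is cut off before the fix is described; `dev_max_kb` is a hypothetical per-device I/O size limit.

```c
#include <assert.h>

/* Largest single I/O issued for an application read of req_kb when the
 * read-ahead window (ra_kb) is the only cap. */
unsigned int capped_by_readahead_kb(unsigned int req_kb, unsigned int ra_kb)
{
    return req_kb < ra_kb ? req_kb : ra_kb;
}

/* Assumed alternative: when the application asks for more than the
 * read-ahead window and the device can take more, let the device limit
 * (hypothetical dev_max_kb) be the cap instead. */
unsigned int capped_by_device_kb(unsigned int req_kb, unsigned int ra_kb,
                                 unsigned int dev_max_kb)
{
    unsigned int limit = ra_kb;

    if (req_kb > limit && dev_max_kb > limit)
        limit = req_kb < dev_max_kb ? req_kb : dev_max_kb;
    return req_kb < limit ? req_kb : limit;
}
```

For a 256K read with the 128K default window, the first function yields 128 and the second yields 256 when the device limit is at least 256K.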

Re: [PATCH v1 1/7] Include: Add definitions for sed

2016-11-17 Thread Christoph Hellwig
> @@ -0,0 +1,58 @@ > +/* > + * Copyright © 2016 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Jens Axboe
On 11/17/2016 06:25 AM, Jens Axboe wrote: On 11/17/2016 05:50 AM, Christoph Hellwig wrote: On Wed, Nov 16, 2016 at 09:43:57PM -0700, Jens Axboe wrote: OK, I'm getting reasonably happy with it now: http://git.kernel.dk/cgit/linux-block/log/?h=for-4.10/hch-dio Sharing the FUA setting to avoid

Re: [PATCH v1 0/7] SED OPAL Library

2016-11-17 Thread Christoph Hellwig
Hi Scott, I took a look at the code and here are some very high level comments: - we only call into block_device_operations.sec_ops from the ioctl handlers. So instead of adding it to the block layer I'd rather structure the code so that the driver itself calls a new common

Re: [PATCH v1 0/7] SED OPAL Library

2016-11-17 Thread Rafael Antognolli
On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote: > On Thu, Nov 17, 2016 at 05:12:51AM -0800, Christoph Hellwig wrote: > > Hi Scott, > > > > I took a look at the code and here are some very high level comments: > > > > - we only call into block_device_operations.sec_ops from the

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Christoph Hellwig
On Thu, Nov 17, 2016 at 08:03:46AM -0700, Jens Axboe wrote: > > Looks like I forgot to push out the latest changes, that is gone. We're > > just using + 1 now to see if we should use the multibio function or not. > > Just to clarify, the correct tip is: Re-checking for-4.10/dio now everything

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Christoph Hellwig
On Wed, Nov 16, 2016 at 01:02:05PM -0700, Jens Axboe wrote: > I'm fine with this approach, but I would still REALLY like to see > blkdev_bio_end_io() split in two, once for sync and once for async. That > would be a lot cleaner, imho. Do you still want that? It's not in your latest tree even

Re: [PATCH v1 0/7] SED OPAL Library

2016-11-17 Thread Scott Bauer
On Thu, Nov 17, 2016 at 11:28:07AM -0800, Christoph Hellwig wrote: > On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote: > > > > I want some further clarification, if you don't mind. We call sec_ops > > inside the actual logic for the opal code. Which is only accessible via the > >

Re: [PATCHSET] Add support for simplified async direct-io

2016-11-17 Thread Christoph Hellwig
Ok, the updated tree is here: http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/new-dio This moves your O_SYNC patch earlier and makes it not directly assign bi_opf. Then comes the main direct I/O path with the various folds, and a use after free fix ported from the iomap code in

Re: [PATCH v1 0/7] SED OPAL Library

2016-11-17 Thread Christoph Hellwig
On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote: > > I want some further clarification, if you don't mind. We call sec_ops > inside the actual logic for the opal code. Which is only accessible via the > ioctls, is that what you were meaning? When you say "the driver calls" > do you

[PATCH v2 2/5] block: add support for REQ_OP_WRITE_ZEROES

2016-11-17 Thread Chaitanya Kulkarni
From: Chaitanya Kulkarni This adds a new block layer operation to zero out a range of LBAs. This allows to implement zeroing for devices that don't use either discard with a predictable zero pattern or WRITE SAME of zeroes. The prominent example of that is NVMe with

[md PATCH 1/6] md/failfast: add failfast flag for md to be used by some personalities.

2016-11-17 Thread NeilBrown
This patch just adds a 'failfast' per-device flag which can be stored in v0.90 or v1.x metadata. The flag is not used yet but the intent is that it can be used for mirrored (raid1/raid10) arrays where low latency is more important than keeping all devices on-line. Setting the flag for a device

[md PATCH 2/6] md: Use REQ_FAILFAST_* on metadata writes where appropriate

2016-11-17 Thread NeilBrown
This can only be supported on personalities which ensure that md_error() never causes an array to enter the 'failed' state. i.e. if marking a device Faulty would cause some data to be inaccessible, the device's status is left as non-Faulty. This is true for RAID1 and RAID10. If we get a

[md PATCH 5/6] md/raid10: add failfast handling for reads.

2016-11-17 Thread NeilBrown
If a device is marked FailFast, and it is not the only device we can read from, we mark the bio as MD_FAILFAST. If this does fail-fast, we don't try read repair but just allow failure. If it was the last device, it doesn't get marked Faulty so the retry happens on the same device - this time

[md PATCH 6/6] md/raid10: add failfast handling for writes.

2016-11-17 Thread NeilBrown
When writing to a failfast device, we use MD_FAILFAST unless it is the only device being written to. For resync/recovery, assume there was a working device to read from so always use MD_FAILFAST. If a write for resync/recovery fails, we just fail the device - there is not much else to do. If a

[md PATCH 4/6] md/raid1: add failfast handling for writes.

2016-11-17 Thread NeilBrown
When writing to a failfast device we use MD_FAILFAST unless it is the only device being written to. For resync/recovery, assume there was a working device to read from so always use REQ_FAILFAST_DEV. If a write for resync/recovery fails, we just fail the device - there is not much else to do.
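
The write-side rule stated above can be written as a small decision function. This is a sketch of the rule as worded in the message, not the md code; `write_uses_failfast()`, `enum write_kind` and the `targets` count are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum write_kind { WRITE_NORMAL, WRITE_RESYNC };

/* Decide whether a write to one device should carry fail-fast semantics.
 * dev_failfast: the device has the per-device failfast flag set
 * targets:      number of devices this write is being sent to
 * kind:         normal write, or a resync/recovery write */
bool write_uses_failfast(bool dev_failfast, int targets, enum write_kind kind)
{
    assert(targets >= 1);
    if (!dev_failfast)
        return false;
    if (kind == WRITE_RESYNC)
        return true;        /* a working source device is assumed to exist */
    return targets > 1;     /* never fail fast on the only target */
}
```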

[PATCH/RFC] add "failfast" support for raid1/raid10.

2016-11-17 Thread NeilBrown
Hi, I've been sitting on these patches for a while because although they solve a real problem, it is a fairly limited use-case, and I don't really like some of the details. So I'm posting them as RFC in the hope that a different perspective might help me like them better, or find a better

[PATCH] block: call trace_block_split() from bio_split()

2016-11-17 Thread NeilBrown
Somewhere around Commit: 20d0189b1012 ("block: Introduce new bio_split()") and Commit: 4b1faf931650 ("block: Kill bio_pair_split()") in 3.14 we lost the call to trace_block_split() from bio_split(). Commit: cda22646adaa ("block: add call to split trace point") in 4.5 added it back for

[md PATCH 3/6] md/raid1: add failfast handling for reads.

2016-11-17 Thread NeilBrown
If a device is marked FailFast and it is not the only device we can read from, we mark the bio with REQ_FAILFAST_* flags. If this does fail, we don't try read repair but just allow failure. If it was the last device it doesn't fail of course, so the retry happens on the same device - this time
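
The read-side rule can be sketched the same way. `read_uses_failfast()` and `readable_devices` are hypothetical names, and the function only restates the condition given in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* A read is issued fail-fast only if the chosen device has the failfast
 * flag AND at least one other device could serve the same read. */
bool read_uses_failfast(bool dev_failfast, int readable_devices)
{
    assert(readable_devices >= 1);
    return dev_failfast && readable_devices > 1;
}
```

When the function returns false for the last remaining device, an error leads to a retry on that same device rather than to the device being failed.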

Re: [PATCH v2 2/5] block: add support for REQ_OP_WRITE_ZEROES

2016-11-17 Thread Martin K. Petersen
> "Chaitanya" == Chaitanya Kulkarni writes: Chaitanya> This adds a new block layer operation to zero out a range of Chaitanya> LBAs. This allows to implement zeroing for devices that don't Chaitanya> use either discard with a predictable zero pattern or WRITE

Re: [PATCH 2/5] block: add support for REQ_OP_WRITE_ZEROES

2016-11-17 Thread chaitany kulkarni
Incorporated the comments and sent new patch. On Wed, Nov 16, 2016 at 6:48 PM, Martin K. Petersen wrote: >> "Keith" == Keith Busch writes: > > Keith> Your maximum bi_size exceeds the 2-bytes an NVMe Write Zeroes > Keith> command provides
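
The size limit raised in the quoted review can be made concrete with a little arithmetic. The sketch assumes the two-byte field is a zero-based block count, so one command covers at most 65536 logical blocks; `zero_cmds_needed()` and `encode_block_count()` are hypothetical helpers, not NVMe driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16-bit, zero-based count, so a field value of 0 means 1 block. */
#define MAX_BLOCKS_PER_CMD 65536u

/* Number of commands needed to zero nr_blocks logical blocks. */
uint64_t zero_cmds_needed(uint64_t nr_blocks)
{
    return (nr_blocks + MAX_BLOCKS_PER_CMD - 1) / MAX_BLOCKS_PER_CMD;
}

/* Encode a per-command block count into the 16-bit zero-based field. */
uint16_t encode_block_count(uint32_t blocks_in_cmd)
{
    assert(blocks_in_cmd >= 1 && blocks_in_cmd <= MAX_BLOCKS_PER_CMD);
    return (uint16_t)(blocks_in_cmd - 1);
}
```

A bio larger than 65536 blocks would therefore have to be split before being turned into commands.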

Re: [PATCH v2 2/5] block: add support for REQ_OP_WRITE_ZEROES

2016-11-17 Thread Christoph Hellwig
On Thu, Nov 17, 2016 at 02:17:11PM -0800, Chaitanya Kulkarni wrote: > From: Chaitanya Kulkarni > > This adds a new block layer operation to zero out a range of > LBAs. This allows to implement zeroing for devices that don't use > either discard with a predictable

Re: [PATCH/RFC] add "failfast" support for raid1/raid10.

2016-11-17 Thread Hannes Reinecke
(Seeing that it was me who initiated those patches I guess I should speak up here) On 11/18/2016 06:16 AM, NeilBrown wrote: > Hi, > > I've been sitting on these patches for a while because although they > solve a real problem, it is a fairly limited use-case, and I don't > really like some of

Re: [PATCH v4] mm: don't cap request size based on read-ahead setting

2016-11-17 Thread Hillf Danton
On Friday, November 18, 2016 5:23 AM Jens Axboe wrote: > > We ran into a funky issue, where someone doing 256K buffered reads saw > 128K requests at the device level. Turns out it is read-ahead capping > the request size, since we use 128K as the default setting. This doesn't > make a lot of