Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 8:03 PM, Ming Lei wrote: > On Wed, Dec 05, 2018 at 10:58:02AM +0800, Ming Lei wrote: >> On Tue, Dec 04, 2018 at 07:30:24PM -0700, Jens Axboe wrote: >>> On 12/4/18 7:27 PM, Ming Lei wrote: On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: > On 12/4/18 6:37 PM, Ming Lei

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 7:58 PM, Ming Lei wrote: > On Tue, Dec 04, 2018 at 07:30:24PM -0700, Jens Axboe wrote: >> On 12/4/18 7:27 PM, Ming Lei wrote: >>> On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: On 12/4/18 6:37 PM, Ming Lei wrote: > On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Ming Lei
On Wed, Dec 05, 2018 at 10:58:02AM +0800, Ming Lei wrote: > On Tue, Dec 04, 2018 at 07:30:24PM -0700, Jens Axboe wrote: > > On 12/4/18 7:27 PM, Ming Lei wrote: > > > On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: > > >> On 12/4/18 6:37 PM, Ming Lei wrote: > > >>> On Tue, Dec 04, 2018

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Ming Lei
On Tue, Dec 04, 2018 at 07:30:24PM -0700, Jens Axboe wrote: > On 12/4/18 7:27 PM, Ming Lei wrote: > > On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: > >> On 12/4/18 6:37 PM, Ming Lei wrote: > >>> On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: > If we attempt a direct

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 7:27 PM, Ming Lei wrote: > On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: >> On 12/4/18 6:37 PM, Ming Lei wrote: >>> On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: If we attempt a direct issue to a SCSI device, and it returns BUSY, then we queue the

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Ming Lei
On Tue, Dec 04, 2018 at 07:16:11PM -0700, Jens Axboe wrote: > On 12/4/18 6:37 PM, Ming Lei wrote: > > On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: > >> If we attempt a direct issue to a SCSI device, and it returns BUSY, then > >> we queue the request up normally. However, the SCSI

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 6:38 PM, Guenter Roeck wrote: > On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: >> If we attempt a direct issue to a SCSI device, and it returns BUSY, then >> we queue the request up normally. However, the SCSI layer may have >> already setup SG tables etc for this

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 7:16 PM, Jens Axboe wrote: > On 12/4/18 6:37 PM, Ming Lei wrote: >> On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: >>> If we attempt a direct issue to a SCSI device, and it returns BUSY, then >>> we queue the request up normally. However, the SCSI layer may have >>> already

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
On 12/4/18 6:37 PM, Ming Lei wrote: > On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: >> If we attempt a direct issue to a SCSI device, and it returns BUSY, then >> we queue the request up normally. However, the SCSI layer may have >> already setup SG tables etc for this particular

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Guenter Roeck
On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: > If we attempt a direct issue to a SCSI device, and it returns BUSY, then > we queue the request up normally. However, the SCSI layer may have > already setup SG tables etc for this particular command. If we later > merge with this

Re: [PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Ming Lei
On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: > If we attempt a direct issue to a SCSI device, and it returns BUSY, then > we queue the request up normally. However, the SCSI layer may have > already setup SG tables etc for this particular command. If we later > merge with this

[PATCH 26/26] aio: add support for submission/completion rings

2018-12-04 Thread Jens Axboe
Experimental support for submitting and completing IO through rings shared between the application and kernel. The submission rings are struct iocb, like we would submit through io_submit(), and the completion rings are struct io_event, like we would pass in (and copy back) from io_getevents().
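
The ring layout itself is only described in prose here; the sketch below is a hypothetical illustration of what "submission ring of struct iocb, completion ring of struct io_event" implies. The struct names, field names, and head/tail convention are invented for illustration and are not taken from the patch.

    /* Hypothetical illustration only: the application produces iocbs into a
     * submission ring and consumes io_events from a completion ring, with
     * head/tail indices shared between the two sides. */
    #include <linux/aio_abi.h>

    struct sq_ring_sketch {
        unsigned int    head;           /* consumed by the kernel */
        unsigned int    tail;           /* produced by the application */
        unsigned int    nr_entries;
        struct iocb     iocbs[];        /* submission entries */
    };

    struct cq_ring_sketch {
        unsigned int    head;           /* consumed by the application */
        unsigned int    tail;           /* produced by the kernel */
        unsigned int    nr_entries;
        struct io_event events[];       /* completion entries */
    };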

[PATCH 25/26] aio: split old ring complete out from aio_complete()

2018-12-04 Thread Jens Axboe
Signed-off-by: Jens Axboe --- fs/aio.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 1c8a8bb631a9..39aaffd6d436 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1218,12 +1218,9 @@ static void aio_fill_event(struct io_event *ev, struct

[PATCH 23/26] fs: add support for mapping an ITER_BVEC for O_DIRECT

2018-12-04 Thread Jens Axboe
This adds support for sync/async O_DIRECT to make a bvec type iter for bdev access, as well as iomap. Signed-off-by: Jens Axboe --- fs/block_dev.c | 16 fs/iomap.c | 5 - 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c

[PATCH 24/26] aio: add support for pre-mapped user IO buffers

2018-12-04 Thread Jens Axboe
If we have fixed user buffers, we can map them into the kernel when we setup the io_context. That avoids the need to do get_user_pages() for each and every IO. To utilize this feature, the application must set both IOCTX_FLAG_USERIOCB, to provide iocb's in userspace, and then

[PATCH 21/26] block: add BIO_HOLD_PAGES flag

2018-12-04 Thread Jens Axboe
For user mapped IO, we do get_user_pages() upfront, and then do a put_page() on each page at end_io time to release the page reference. In preparation for having permanently mapped pages, add a BIO_HOLD_PAGES flag that tells us not to release the pages, the caller will do that. Signed-off-by:
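
A minimal sketch of how completion-side code would honor such a flag; the function name is hypothetical and BIO_HOLD_PAGES is only what this patch proposes (it is not a mainline flag), so treat this as an assumption-laden illustration rather than the patch itself.

    #include <linux/bio.h>
    #include <linux/mm.h>

    /* Sketch only: drop the per-page references taken at map time unless the
     * submitter asked to keep them (pre-mapped, long-lived buffers). */
    static void bio_release_pages_sketch(struct bio *bio)
    {
        unsigned int i;

        if (bio_flagged(bio, BIO_HOLD_PAGES))
            return;             /* caller owns the page references */

        for (i = 0; i < bio->bi_vcnt; i++)
            put_page(bio->bi_io_vec[i].bv_page);
    }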

[PATCH 19/26] aio: split iocb init from allocation

2018-12-04 Thread Jens Axboe
In preparation for having pre-allocated requests that we then just need to initialize before use. Signed-off-by: Jens Axboe --- fs/aio.c | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 634b540b0c92..416bb2e365e0 100644 ---

[PATCH 04/26] block: use REQ_HIPRI_ASYNC for non-sync polled IO

2018-12-04 Thread Jens Axboe
Tell the block layer if it's a sync or async polled request, so it can do the right thing. Signed-off-by: Jens Axboe --- fs/block_dev.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 6de8d35f6e41..b8f574615792 100644 ---

[PATCH 15/26] aio: support for IO polling

2018-12-04 Thread Jens Axboe
Add polled variants of PREAD/PREADV and PWRITE/PWRITEV. These act like their non-polled counterparts, except we expect to poll for completion of them. The polling happens at io_getevents() time, and works just like non-polled IO. To set up an io_context for polled IO, the application must call

[PATCH 03/26] block: wire up block device iopoll method

2018-12-04 Thread Jens Axboe
From: Christoph Hellwig Just call blk_poll on the iocb cookie, we can derive the block device from the inode trivially. Reviewed-by: Johannes Thumshirn Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/block_dev.c | 10 ++ 1 file changed, 10 insertions(+) diff --git

[PATCH 02/26] block: add REQ_HIPRI_ASYNC

2018-12-04 Thread Jens Axboe
For the upcoming async polled IO, we can't sleep allocating requests. If we do, then we introduce a deadlock where the submitter already has async polled IO in-flight, but can't wait for them to complete since polled requests must be actively found and reaped. Signed-off-by: Jens Axboe ---
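
To make the constraint concrete, here is a sketch of how a submitter might pick the request flags under this scheme. REQ_HIPRI_ASYNC is the flag this patch proposes; the idea that it amounts to REQ_HIPRI plus non-blocking (REQ_NOWAIT-style) allocation semantics is my assumption, inferred from the text above, not quoted from the patch.

    #include <linux/types.h>
    #include <linux/blk_types.h>

    /* Sketch under the stated assumption: sync polled IO may block on request
     * allocation, async polled IO must not, because the submitter is the only
     * one who will ever reap its completions. */
    static inline unsigned int polled_op_flags_sketch(bool is_sync)
    {
        return is_sync ? REQ_HIPRI : REQ_HIPRI_ASYNC;
    }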

[PATCH 20/26] aio: batch aio_kiocb allocation

2018-12-04 Thread Jens Axboe
Similarly to how we use the state->ios_left to know how many references to get to a file, we can use it to allocate the aio_kiocb's we need in bulk. Signed-off-by: Jens Axboe --- fs/aio.c | 47 +++ 1 file changed, 39 insertions(+), 8 deletions(-)

[PATCH 13/26] aio: add io_setup2() system call

2018-12-04 Thread Jens Axboe
This is just like io_setup(), except add a flags argument to let the caller control/define some of the io_context behavior. Outside of the flags, we add an iocb array and two user pointers for future use. Signed-off-by: Jens Axboe --- arch/x86/entry/syscalls/syscall_64.tbl | 1 + fs/aio.c
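
The preview spells out the shape of the call (io_setup() semantics plus a flags argument, an iocb array, and two user pointers reserved for later use), so here is a hypothetical userspace wrapper. The prototype, argument order, and __NR_io_setup2 are assumptions based only on this description; the series was never merged in this form, so none of this is a stable ABI.

    /* Hypothetical wrapper; __NR_io_setup2 would come from the patched
     * kernel's headers and does not exist in mainline. */
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/aio_abi.h>

    static int io_setup2_sketch(unsigned int nr_events, unsigned int flags,
                                struct iocb *iocbs, void *user1, void *user2,
                                aio_context_t *ctxp)
    {
        return syscall(__NR_io_setup2, nr_events, flags, iocbs,
                       user1, user2, ctxp);
    }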

[PATCH 11/26] aio: split out iocb copy from io_submit_one()

2018-12-04 Thread Jens Axboe
In preparation of handing in iocbs in a different fashion as well. Also make it clear that the iocb being passed in isn't modified, by marking it const throughout. Signed-off-by: Jens Axboe --- fs/aio.c | 68 +++- 1 file changed, 38

[PATCH 14/26] aio: add support for having user mapped iocbs

2018-12-04 Thread Jens Axboe
For io_submit(), we have to first copy each pointer to an iocb, then copy the iocb. The latter is 64 bytes in size, and that's a lot of copying for a single IO. Add support for setting IOCTX_FLAG_USERIOCB through the new io_setup2() system call, which allows the iocbs to reside in userspace. If

[PATCH 16/26] aio: add submission side request cache

2018-12-04 Thread Jens Axboe
We have to add each submitted polled request to the io_context poll_submitted list, which means we have to grab the poll_lock. We already use the block plug to batch submissions if we're doing a batch of IO submissions, extend that to cover the poll requests internally as well. Signed-off-by:

[PATCH 01/26] fs: add an iopoll method to struct file_operations

2018-12-04 Thread Jens Axboe
From: Christoph Hellwig This new method is used to explicitly poll for I/O completion for an iocb. It must be called for any iocb submitted asynchronously (that is, with a non-null ki_complete) which has the IOCB_HIPRI flag set. The method is assisted by a new ki_cookie field in struct kiocb to

[PATCH 12/26] aio: abstract out io_event filler helper

2018-12-04 Thread Jens Axboe
Signed-off-by: Jens Axboe --- fs/aio.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 06c8bcc72496..173f1f79dc8f 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1063,6 +1063,15 @@ static inline void iocb_put(struct aio_kiocb *iocb)

[PATCH 22/26] block: implement bio helper to add iter bvec pages to bio

2018-12-04 Thread Jens Axboe
For an ITER_BVEC, we can just iterate the iov and add the pages to the bio directly. Signed-off-by: Jens Axboe --- block/bio.c | 27 +++ include/linux/bio.h | 1 + 2 files changed, 28 insertions(+) diff --git a/block/bio.c b/block/bio.c index
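
A simplified sketch of the loop being described (helper name hypothetical): an ITER_BVEC already carries page/offset/length triples, so they can be handed to the existing bio_add_page() helper directly.

    #include <linux/bio.h>
    #include <linux/uio.h>
    #include <linux/errno.h>

    /* Sketch: add each bvec segment to the bio until it is full, then advance
     * the iterator past whatever was consumed. */
    static int bio_add_bvec_pages_sketch(struct bio *bio, struct iov_iter *iter)
    {
        const struct bio_vec *bv = iter->bvec;
        size_t added = 0;
        unsigned int i;

        for (i = 0; i < iter->nr_segs; i++) {
            if (bio_add_page(bio, bv[i].bv_page, bv[i].bv_len,
                             bv[i].bv_offset) != bv[i].bv_len)
                break;          /* bio is full */
            added += bv[i].bv_len;
        }

        iov_iter_advance(iter, added);
        return added ? 0 : -EINVAL;
    }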

[PATCH 07/26] aio: separate out ring reservation from req allocation

2018-12-04 Thread Jens Axboe
From: Christoph Hellwig This is in preparation for certain types of IO not needing a ring reservation. Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 30 +- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/fs/aio.c

[PATCH 18/26] aio: use fget/fput_many() for file references

2018-12-04 Thread Jens Axboe
On the submission side, add file reference batching to the aio_submit_state. We get as many references as the number of iocbs we are submitting, and drop unused ones if we end up switching files. The assumption here is that we're usually only dealing with one fd, and if there are multiple,

[PATCH 17/26] fs: add fget_many() and fput_many()

2018-12-04 Thread Jens Axboe
Some use cases repeatedly get and put references to the same file, but the only exposed interface does these one at a time. As each of these entails an atomic inc or dec on a shared structure, that cost can add up. Add fget_many(), which works just like fget(), except it takes an argument
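
The usage pattern being described, as a rough sketch (assuming the signatures this patch proposes, fget_many(fd, refs) and fput_many(file, refs)): take the references for a whole batch up front, then return whatever goes unused.

    #include <linux/file.h>
    #include <linux/fs.h>
    #include <linux/errno.h>

    /* Sketch: nr iocbs all target the same fd, so take nr references with a
     * single atomic op; each issued iocb consumes one via fput() at completion,
     * and any leftovers are dropped in bulk here. */
    static int submit_batch_sketch(unsigned int fd, unsigned int nr)
    {
        struct file *file;
        unsigned int used = 0;

        file = fget_many(fd, nr);
        if (!file)
            return -EBADF;

        /* ... issue up to nr iocbs against 'file', incrementing 'used' ... */

        if (used < nr)
            fput_many(file, nr - used);
        return 0;
    }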

[PATCH 06/26] aio: use assigned completion handler

2018-12-04 Thread Jens Axboe
We know this is a read/write request, but in preparation for having different kinds of those, ensure that we call the assigned handler instead of assuming it's aio_complete_rq(). Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 2 +- 1 file changed, 1 insertion(+), 1

[PATCH 10/26] aio: use iocb_put() instead of open coding it

2018-12-04 Thread Jens Axboe
Replace the percpu_ref_put() + kmem_cache_free() with a call to iocb_put() instead. Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index ed6c3914477a..cf93b92bfb1e 100644 ---

[PATCH 09/26] aio: only use blk plugs for > 2 depth submissions

2018-12-04 Thread Jens Axboe
Plugging is meant to optimize submission of a string of IOs, if we don't have more than 2 being submitted, don't bother setting up a plug. Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git
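
The heuristic is small enough to show directly; a sketch (helper name hypothetical), with the matching blk_finish_plug() left to the caller once the batch has been submitted.

    #include <linux/blkdev.h>

    /* Sketch: plugging amortizes submission cost over a batch, so only set one
     * up when more than two iocbs are being submitted in this call. */
    static void aio_maybe_plug_sketch(struct blk_plug *plug, long nr)
    {
        if (nr > 2)
            blk_start_plug(plug);
    }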

[PATCH 05/26] iomap: wire up the iopoll method

2018-12-04 Thread Jens Axboe
From: Christoph Hellwig Store the request queue the last bio was submitted to in the iocb private data in addition to the cookie so that we find the right block device. Also refactor the common direct I/O bio submission code into a nice little helper. Signed-off-by: Christoph Hellwig

[PATCHSET v5] Support for polled aio

2018-12-04 Thread Jens Axboe
For the grand introduction to this feature, see my original posting here: https://lore.kernel.org/linux-block/20181117235317.7366-1-ax...@kernel.dk/ and refer to the previous postings of this patchset for whatever features were added there. Particularly v4 has some performance results:

[PATCH] blk-mq: fix corruption with direct issue

2018-12-04 Thread Jens Axboe
If we attempt a direct issue to a SCSI device, and it returns BUSY, then we queue the request up normally. However, the SCSI layer may have already setup SG tables etc for this particular command. If we later merge with this request, then the old tables are no longer valid. Once we issue the IO,
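
The invariant this thread converges on is that a request which has already been through ->queue_rq() (and may carry driver-private state such as SCSI SG tables) must never be merged with later IO. A conceptual sketch of that idea, not the committed fix: the wrapper name is hypothetical, while blk_mq_request_bypass_insert() is an existing block-layer helper of that era.

    #include "blk-mq.h"     /* block-layer internal header */

    /* Sketch: a request bounced back BUSY from ->queue_rq() may already have
     * driver state attached, so re-insert it on the dispatch list where it
     * cannot be merged with later bios, and kick the queue to retry it. */
    static void requeue_prepped_request_sketch(struct request *rq)
    {
        blk_mq_request_bypass_insert(rq, true);
    }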

Re: [PATCH 2/3] blktests: add python scripts for parsing fio json output

2018-12-04 Thread Federico Motta
On 12/4/18 6:47 PM, Josef Bacik wrote: > I wrote these scripts for xfstests, just copying them over to blktests > as well. The terse output fio support that blktests has doesn't get all of > the various fio performance things that we may want to look at; this > gives us a lot of flexibility around

Re: [PATCH 3/3] blktests: block/025: an io.latency test

2018-12-04 Thread Federico Motta
On 12/4/18 6:47 PM, Josef Bacik wrote: > This is a test to verify io.latency is working properly. It does this > by first running a fio job by itself to figure out how fast it runs. > Then we calculate some thresholds, set up 2 cgroups, a fast and a slow > cgroup, and then run the same job in

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread James Smart
On 12/4/2018 1:21 PM, Keith Busch wrote: On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote: I disagree.  The cases I've run into are on the admin queue - where we are sending io to initialize the controller when another error/reset occurs, and the checks are required to

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Keith Busch
On Tue, Dec 04, 2018 at 02:21:17PM -0700, Keith Busch wrote: > On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote: > > On 12/4/2018 9:48 AM, Keith Busch wrote: > > > Once quiesced, the proposed iterator can handle the final termination > > > of the request, perform failover, or some other

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Keith Busch
On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote: > > > On 12/4/2018 9:48 AM, Keith Busch wrote: > > On Tue, Dec 04, 2018 at 09:38:29AM -0800, Sagi Grimberg wrote: > > > > > > Yes, I'm very much in favour of this, too. > > > > > > We always have this IMO slightly weird notion of

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread James Smart
On 12/4/2018 9:48 AM, Keith Busch wrote: On Tue, Dec 04, 2018 at 09:38:29AM -0800, Sagi Grimberg wrote: Yes, I'm very much in favour of this, too. We always have this IMO slightly weird notion of stopping the queue, set some error flags in the driver, then _restarting_ the queue, just so

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread James Smart
On 12/4/2018 9:23 AM, Sagi Grimberg wrote: Yes, I'm very much in favour of this, too. We always have this IMO slightly weird notion of stopping the queue, set some error flags in the driver, then _restarting_ the queue, just so that the driver then sees the error flag and terminates the

Re: block and nvme polling improvements V3

2018-12-04 Thread Jens Axboe
On 12/2/18 9:46 AM, Christoph Hellwig wrote: > Hi all, > > this series optimizes a few bits in the block layer and nvme code > related to polling. > > It starts by moving the queue types recently introduced entirely into > the block layer instead of requiring an indirect call for them. > > It

Re: [PATCH 06/13] nvme-pci: refactor nvme_disable_io_queues

2018-12-04 Thread Jens Axboe
On 12/4/18 8:05 AM, Christoph Hellwig wrote: > On Mon, Dec 03, 2018 at 05:00:59PM -0800, Sagi Grimberg wrote: >> >>> @@ -2428,7 +2426,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, >>> bool shutdown) >>> nvme_stop_queues(>ctrl); >>> if (!dead && dev->ctrl.queue_count > 0)

Re: [PATCH 05/13] nvme-pci: consolidate code for polling non-dedicated queues

2018-12-04 Thread Jens Axboe
On 12/4/18 10:13 AM, Sagi Grimberg wrote: > +static int nvme_poll_irqdisable(struct nvme_queue *nvmeq, unsigned int tag) >>> >>> Do we still need to carry the tag around? >> >> Yes, the timeout handler polls for a specific tag. > > Does it have to? the documentation suggests that we

Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT

2018-12-04 Thread Jens Axboe
On 12/4/18 7:48 AM, Christoph Hellwig wrote: > On Fri, Nov 30, 2018 at 10:17:49AM -0700, Jens Axboe wrote: >>> Setting REQ_NOWAIT from inside the block layer will make the code that >>> submits requests harder to review. Have you considered to make this code >>> fail I/O if REQ_NOWAIT has not been

[PATCH 0/3] Unify the throttling code for wbt and io-latency

2018-12-04 Thread Josef Bacik
Originally, when I wrote io-latency and the rq_qos code to provide a common base between wbt and io-latency, I left out the throttling part. These were basically the same, but slightly different in both cases. The difference was big enough, and the code simple enough, that I just copied it into

[PATCH 1/3] block: add rq_qos_wait to rq_qos

2018-12-04 Thread Josef Bacik
Originally when I split out the common code from blk-wbt into rq_qos I left the wbt_wait() where it was and simply copied and modified it slightly to work for io-latency. However they are both basically the same thing, and as time has gone on wbt_wait() has ended up much smarter and kinder than
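
A stripped-down sketch of the common shape both paths share (this is not the kernel's actual rq_qos_wait(), whose callbacks and waitqueue handling are more involved): the caller supplies a "may I issue another request?" callback and sleeps until it says yes.

    #include <linux/types.h>
    #include <linux/wait.h>

    typedef bool (acquire_inflight_fn)(void *private);

    /* Simplified sketch: both wbt and io-latency reduce to "sleep on a shared
     * waitqueue until the inflight budget admits another request". */
    static void rq_qos_wait_sketch(wait_queue_head_t *wq,
                                   acquire_inflight_fn *acquire, void *private)
    {
        if (acquire(private))
            return;                         /* fast path, no throttling */

        wait_event(*wq, acquire(private));  /* woken when budget frees up */
    }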

[PATCH 2/3] block: convert wbt_wait() to use rq_qos_wait()

2018-12-04 Thread Josef Bacik
Now that we have rq_qos_wait() in place, convert wbt_wait() over to using it with its specific callbacks. Signed-off-by: Josef Bacik --- block/blk-wbt.c | 65 ++--- 1 file changed, 11 insertions(+), 54 deletions(-) diff --git

[PATCH 3/3] block: convert io-latency to use rq_qos_wait

2018-12-04 Thread Josef Bacik
Now that we have this common helper, convert io-latency over to use it as well. Signed-off-by: Josef Bacik --- block/blk-iolatency.c | 31 --- 1 file changed, 8 insertions(+), 23 deletions(-) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Keith Busch
On Tue, Dec 04, 2018 at 09:38:29AM -0800, Sagi Grimberg wrote: > > > > > Yes, I'm very much in favour of this, too. > > > > We always have this IMO slightly weird notion of stopping the queue, set > > > > some error flags in the driver, then _restarting_ the queue, just so > > > > that the driver

[PATCH 1/3] blktests: add cgroup2 infrastructure

2018-12-04 Thread Josef Bacik
In order to test io.latency and other cgroup related things we need some supporting helpers to setup and tear down cgroup2. This adds support for checking that we can even configure cgroup2 things, set them up if need be, and then add the cleanup stuff to the main cleanup function so everything

[PATCH 0/3] io.latency test for blktests

2018-12-04 Thread Josef Bacik
This patchset is to add a test to verify io.latency is working properly, and to add all the supporting code to run that test. First is the cgroup2 infrastructure which is fairly straightforward. Just verifies we have cgroup2, and gives us the helpers to check and make sure we have the right

[PATCH 3/3] blktests: block/025: an io.latency test

2018-12-04 Thread Josef Bacik
This is a test to verify io.latency is working properly. It does this by first running a fio job by itself to figure out how fast it runs. Then we calculate some thresholds, set up 2 cgroups, a fast and a slow cgroup, and then run the same job in both groups simultaneously. We should see the

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Sagi Grimberg
Yes, I'm very much in favour of this, too. We always have this IMO slightly weird notion of stopping the queue, set some error flags in the driver, then _restarting_ the queue, just so that the driver then sees the error flag and terminates the requests. Which I always found quite

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Sagi Grimberg
Yes, I'm very much in favour of this, too. We always have this IMO slightly weird notion of stopping the queue, set some error flags in the driver, then _restarting_ the queue, just so that the driver then sees the error flag and terminates the requests. Which I always found quite

Re: [PATCH 10/13] nvme-mpath: remove I/O polling support

2018-12-04 Thread Sagi Grimberg
If it really becomes an issue we should rework the nvme code to also skip the multipath code for any private namespace, even if that could mean some trouble when rescanning. This requires some explanation? skip the multipath code how? We currently always go through the multipath node as

Re: [PATCH 05/13] nvme-pci: consolidate code for polling non-dedicated queues

2018-12-04 Thread Sagi Grimberg
+static int nvme_poll_irqdisable(struct nvme_queue *nvmeq, unsigned int tag) Do we still need to carry the tag around? Yes, the timeout handler polls for a specific tag. Does it have to? the documentation suggests that we missed an interrupt, so it is probably waiting on the completion

Re: [PATCH 02/13] nvme-pci: use atomic bitops to mark a queue enabled

2018-12-04 Thread Sagi Grimberg
@@ -2173,6 +2157,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev) if (nr_io_queues == 0) return 0; + + clear_bit(NVMEQ_ENABLED, >flags); This is a change of behavior, looks correct though as we can fail nvme_setup_irqs after we freed the

Re: [PATCH 01/13] block: move queues types to the block layer

2018-12-04 Thread Sagi Grimberg
Nit, there seems to be an extra newline that can be omitted here before the else if statement (if I'm reading this correctly)... Empty lines can always be omitted, but in this case I actually like it as it seems to help readability.. If you think it's useful I'm fine with it as is...

RE: [PATCH] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

2018-12-04 Thread Kashyap Desai
> On Tue, Dec 04, 2018 at 03:30:11PM +0530, Kashyap Desai wrote: > > Problem statement : > > Whenever try to get outstanding request via scsi_host_find_tag, > > block layer will return stale entries instead of actual outstanding > > request. Kernel panic if stale entry is inaccessible or memory is

Re: [PATCH 02/27] aio: clear IOCB_HIPRI

2018-12-04 Thread Jens Axboe
On 12/4/18 7:46 AM, Christoph Hellwig wrote: > On Fri, Nov 30, 2018 at 10:14:31AM -0700, Jens Axboe wrote: >> On 11/30/18 10:13 AM, Christoph Hellwig wrote: >>> I think we'll need to queue this up for 4.21 ASAP independent of the >>> rest, given that with separate poll queues userspace could

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread James Smart
On 12/4/2018 7:46 AM, Keith Busch wrote: On Mon, Dec 03, 2018 at 05:33:06PM -0800, Sagi Grimberg wrote: Yes, I'm very much in favour of this, too. We always have this IMO slightly weird notion of stopping the queue, set some error flags in the driver, then _restarting_ the queue, just so that

Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

2018-12-04 Thread Keith Busch
On Mon, Dec 03, 2018 at 05:33:06PM -0800, Sagi Grimberg wrote: > >> > > Yes, I'm very much in favour of this, too. > > We always have this IMO slightly weird notion of stopping the queue, set > > some error flags in the driver, then _restarting_ the queue, just so > > that the driver then sees

Re: [PATCH v6 0/2] arm64: provide a NEON-accelerated XOR algorithm extension

2018-12-04 Thread Christoph Hellwig
On Tue, Dec 04, 2018 at 04:02:36PM +0100, Ard Biesheuvel wrote: > On Tue, 4 Dec 2018 at 16:01, Christoph Hellwig wrote: > > > > Why does this go to linux-block? > > Because xor_blocks() is part of the RAID driver? The only caller of xor_blocks() seems to be btrfs. And the RAID code has its own list

Re: [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req()

2018-12-04 Thread Jens Axboe
On 12/4/18 7:49 AM, Christoph Hellwig wrote: >> -req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); >> -if (unlikely(!req)) >> -return NULL; >> +req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); >> +if (req) { >> +percpu_ref_get(>reqs); >> +

Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio

2018-12-04 Thread Jens Axboe
On 12/4/18 7:55 AM, Christoph Hellwig wrote: > On Fri, Nov 30, 2018 at 07:21:02PM +, Al Viro wrote: >> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote: >>> For an ITER_KVEC, we can just iterate the iov and add the pages >>> to the bio directly. >> >>> + page =

Re: [PATCH 06/13] nvme-pci: refactor nvme_disable_io_queues

2018-12-04 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 05:00:59PM -0800, Sagi Grimberg wrote: > >> @@ -2428,7 +2426,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, >> bool shutdown) >> nvme_stop_queues(>ctrl); >> if (!dead && dev->ctrl.queue_count > 0) { >> -nvme_disable_io_queues(dev); >> +

Re: [PATCH 10/13] nvme-mpath: remove I/O polling support

2018-12-04 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 05:11:43PM -0800, Sagi Grimberg wrote: >> If it really becomes an issue we >> should rework the nvme code to also skip the multipath code for any >> private namespace, even if that could mean some trouble when rescanning. >> > > This requires some explanation? skip the

Re: [PATCH v6 0/2] arm64: provide a NEON-accelerated XOR algorithm extension

2018-12-04 Thread Will Deacon
On Tue, Dec 04, 2018 at 07:01:24AM -0800, Christoph Hellwig wrote: > Why does this go to linux-block? FWIW, I'll take this via arm64 so feel free to ignore. Will

Re: [PATCH 05/13] nvme-pci: consolidate code for polling non-dedicated queues

2018-12-04 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 04:58:25PM -0800, Sagi Grimberg wrote: > >> +static int nvme_poll_irqdisable(struct nvme_queue *nvmeq, unsigned int tag) > > Do we still need to carry the tag around? Yes, the timeout handler polls for a specific tag.

Re: [PATCH v6 0/2] arm64: provide a NEON-accelerated XOR algorithm extension

2018-12-04 Thread Ard Biesheuvel
On Tue, 4 Dec 2018 at 16:01, Christoph Hellwig wrote: > > Why does this go to linux-block? Because xor_blocks() is part of the RAID driver?

Re: [PATCH 02/13] nvme-pci: use atomic bitops to mark a queue enabled

2018-12-04 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 04:54:15PM -0800, Sagi Grimberg wrote: > >> @@ -2173,6 +2157,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev) >> if (nr_io_queues == 0) >> return 0; >> + >> +clear_bit(NVMEQ_ENABLED, >flags); >> > > This is a change of behavior, looks

Re: [PATCH v6 0/2] arm64: provide a NEON-accelerated XOR algorithm extension

2018-12-04 Thread Christoph Hellwig
Why does this go to linux-block?

Re: [PATCH 01/13] block: move queues types to the block layer

2018-12-04 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 04:49:56PM -0800, Sagi Grimberg wrote: >> @@ -103,12 +101,17 @@ static inline struct blk_mq_hw_ctx >> *blk_mq_map_queue(struct request_queue *q, >> unsigned int flags, >>

Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio

2018-12-04 Thread Christoph Hellwig
On Fri, Nov 30, 2018 at 07:21:02PM +, Al Viro wrote: > On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote: > > For an ITER_KVEC, we can just iterate the iov and add the pages > > to the bio directly. > > > + page = virt_to_page(kv->iov_base); > > + size =

Re: [PATCH 12/27] aio: use iocb_put() instead of open coding it

2018-12-04 Thread Christoph Hellwig
On Fri, Nov 30, 2018 at 09:56:31AM -0700, Jens Axboe wrote: > Replace the percpu_ref_put() + kmem_cache_free() with a call to > iocb_put() instead. > > Signed-off-by: Jens Axboe Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions

2018-12-04 Thread Christoph Hellwig
On Fri, Nov 30, 2018 at 09:56:30AM -0700, Jens Axboe wrote: > Plugging is meant to optimize submission of a string of IOs, if we don't > have more than 2 being submitted, don't bother setting up a plug. > > Signed-off-by: Jens Axboe Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req()

2018-12-04 Thread Christoph Hellwig
> - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); > - if (unlikely(!req)) > - return NULL; > + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); > + if (req) { > + percpu_ref_get(>reqs); > + req->ki_ctx = ctx; > +

Re: [PATCH] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

2018-12-04 Thread Bart Van Assche
On 12/4/18 2:00 AM, Kashyap Desai wrote: Problem statement : Whenever try to get outstanding request via scsi_host_find_tag, block layer will return stale entries instead of actual outstanding request. Kernel panic if stale entry is inaccessible or memory is reused. Fix : Undo request mapping in

Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT

2018-12-04 Thread Christoph Hellwig
On Fri, Nov 30, 2018 at 10:17:49AM -0700, Jens Axboe wrote: > > Setting REQ_NOWAIT from inside the block layer will make the code that > > submits requests harder to review. Have you considered to make this code > > fail I/O if REQ_NOWAIT has not been set and to require that the context > > that

Re: [PATCH 02/27] aio: clear IOCB_HIPRI

2018-12-04 Thread Christoph Hellwig
On Fri, Nov 30, 2018 at 10:14:31AM -0700, Jens Axboe wrote: > On 11/30/18 10:13 AM, Christoph Hellwig wrote: > > I think we'll need to queue this up for 4.21 ASAP independent of the > > rest, given that with separate poll queues userspace could otherwise > > submit I/O that will never get polled

Re: [PATCH v5 0/5] lightnvm: Flexible metadata

2018-12-04 Thread Hans Holmberg
On Mon, Dec 3, 2018 at 9:51 AM Hans Holmberg wrote: > > Great! The tests(rocksdb, pblk recovery and the generic xfs suite) > completed successfully on one of our disks, so feel free to add: > > Tested-by: Hans Holmberg > The tests completed fine on our hardware (which has metadata support), but

Re: [PATCH] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

2018-12-04 Thread Ming Lei
On Tue, Dec 04, 2018 at 03:30:11PM +0530, Kashyap Desai wrote: > Problem statement : > Whenever try to get outstanding request via scsi_host_find_tag, > block layer will return stale entries instead of actual outstanding > request. Kernel panic if stale entry is inaccessible or memory is reused. >

[PATCH] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

2018-12-04 Thread Kashyap Desai
Problem statement: Whenever we try to get an outstanding request via scsi_host_find_tag, the block layer will return stale entries instead of the actual outstanding request. The kernel panics if a stale entry is inaccessible or its memory has been reused. Fix: Undo the request mapping in blk_mq_put_driver_tag once the request is
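
Read together with the thread above, the proposed idea is to clear the tag-to-request mapping when the driver tag is released, so that scsi_host_find_tag() (which walks tags->rqs[]) can no longer return a request the driver has already given back. A sketch of that idea, not the exact diff; it relies on block-layer internal headers.

    #include "blk-mq.h"         /* block-layer internal headers */
    #include "blk-mq-tag.h"

    /* Sketch: clear the rqs[] slot before returning the tag so later tag
     * lookups cannot observe this now-stale request. */
    static void put_driver_tag_sketch(struct blk_mq_hw_ctx *hctx,
                                      struct request *rq)
    {
        hctx->tags->rqs[rq->tag] = NULL;
        blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, rq->tag);
        rq->tag = -1;
    }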