Re: [PATCH 2/2] blk-mq: make sure to back-assign the request to rq_map in blk_mq_alloc_request_hctx

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 05:36:21PM +0200, Sagi Grimberg wrote: > Otherwise we won't be able to retrieve the request from > the tag. > > Signed-off-by: Sagi Grimberg > --- > block/blk-mq.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/block/blk-mq.c b/block/blk-mq.c

Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 09:15:27AM -0700, Jens Axboe wrote: > On 02/27/2017 09:10 AM, Sagi Grimberg wrote: > > > >>> Hm, this may fix the crash, but I'm not sure it'll work as intended. > >>> When we allocate the request, we'll get a reserved scheduler tag, but > >>> then when we go to dispatch

Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 06:10:01PM +0200, Sagi Grimberg wrote: > > > > Hm, this may fix the crash, but I'm not sure it'll work as intended. > > > When we allocate the request, we'll get a reserved scheduler tag, but > > > then when we go to dispatch the request and call > > >

Re: [PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

2017-02-27 Thread Omar Sandoval
On Mon, Feb 27, 2017 at 05:36:20PM +0200, Sagi Grimberg wrote: > Signed-off-by: Sagi Grimberg > --- > block/blk-mq-sched.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > index 98c7b061781e..46ca965fff5c

[PATCH v3 1/2] blk-mq: use sbq wait queues instead of restart for driver tags

2017-02-22 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Commit 50e1dab86aa2 ("blk-mq-sched: fix starvation for multiple hardware queues and shared tags") fixed one starvation issue for shared tags. However, we can still get into a situation where we fail to allocate a tag because all tags are alloca

[PATCH v3 2/2] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-22 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() after we dispatch requests left over on our hardware queue dispatch list. This is so we'll go back and dispatch requests from the scheduler. In this case, it's only necessary to r

[PATCH v2 1/2] blk-mq: use sbq wait queues instead of restart for driver tags

2017-02-21 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Commit 50e1dab86aa2 ("blk-mq-sched: fix starvation for multiple hardware queues and shared tags") fixed one starvation issue for shared tags. However, we can still get into a situation where we fail to allocate a tag because all tags are alloca

[PATCH] scsi_transport_sas: fix BSG ioctl memory corruption

2017-02-21 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> The end_device and sas_host devices support BSG ioctls, but the request_queue allocated for them isn't set up to allocate the struct scsi_request payload. This leads to memory corruption in the call to scsi_req_init() in bsg_map_hdr(), since it will

Re: Manual driver binding and unbinding broken for SCSI

2017-02-19 Thread Omar Sandoval
On Fri, Feb 17, 2017 at 04:43:56PM -0800, James Bottomley wrote: > This seems to be related to a 0day test we got on the block tree, > details here: > > http://marc.info/?t=14862406881 > > I root caused the above to something not being released when it should > be, so it looks like you have

[PATCH 2/2] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-17 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() after we dispatch requests left over on our hardware queue dispatch list. This is so we'll go back and dispatch requests from the scheduler. In this case, it's only necessary to r

[PATCH 1/2] blk-mq: use sbq wait queues instead of restart for driver tags

2017-02-17 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Commit 50e1dab86aa2 ("blk-mq-sched: fix starvation for multiple hardware queues and shared tags") fixed one starvation issue for shared tags. However, we can still get into a situation where we fail to allocate a tag because all tags are alloca

Manual driver binding and unbinding broken for SCSI

2017-02-17 Thread Omar Sandoval
Hi, everyone, As per $SUBJECT, I can cause a crash on v4.10-rc8, Jens' block/for-next, and Jan's bdi branch [1] by doing this: # lsscsi [0:0:0:0] disk QEMU QEMU HARDDISK 2.5+ /dev/sda # echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind # echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/bind

[PATCH v3] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-15 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() after we dispatch requests left over on our hardware queue dispatch list. This is so we'll go back and dispatch requests from the scheduler. In this case, it's only necessary to r

Re: [PATCH BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

2017-02-13 Thread Omar Sandoval
On Tue, Feb 14, 2017 at 07:58:22AM +0100, Hannes Reinecke wrote: > While we're at the topic: > > Can't we use the same names for legacy and mq scheduler? > It's quite an unnecessary complication to have > 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline' > for mq. If we could

Re: [PATCH BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

2017-02-13 Thread Omar Sandoval
On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote: > If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq, > or, vice versa, a blk-mq scheduler is chosen for a device using blk, then > that scheduler is set and initialized without any check, driving the > system into

[PATCH] blk-mq-sched: don't hold queue_lock when calling exit_icq

2017-02-10 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> None of the other blk-mq elevator hooks are called with this lock held. Additionally, it can lead to circular locking dependencies between queue_lock and the private scheduler lock. Reported-by: Paolo Valente <paolo.vale...@linaro.org> Signed-

Re: [PATCH] bfq-mq: cause deadlock by executing exit_icq body immediately

2017-02-08 Thread Omar Sandoval
On Wed, Feb 08, 2017 at 11:39:24AM +0100, Paolo Valente wrote: > > > On 08 Feb 2017, at 11:33, Omar Sandoval <osan...@osandov.com> > > wrote: > > > > On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote: > >> > >>>

Re: [PATCH] bfq-mq: cause deadlock by executing exit_icq body immediately

2017-02-08 Thread Omar Sandoval
On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote: > > > On 07 Feb 2017, at 22:45, Omar Sandoval <osan...@osandov.com> > > wrote: > > > > On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote: > >> Hi, > >

Re: [PATCH] bfq-mq: cause deadlock by executing exit_icq body immediately

2017-02-07 Thread Omar Sandoval
On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote: > Hi, > this patch is meant to show that, if the body of the hook exit_icq is > executed > from inside that hook, and not as deferred work, then a circular deadlock > occurs. > > It happens if, on a CPU > - the body of icq_exit

Re: [PATCH v2] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-06 Thread Omar Sandoval
On Mon, Feb 06, 2017 at 01:07:41PM -0700, Jens Axboe wrote: > On 02/06/2017 12:53 PM, Omar Sandoval wrote: > > On Mon, Feb 06, 2017 at 12:39:57PM -0700, Jens Axboe wrote: > >> On 02/06/2017 12:24 PM, Omar Sandoval wrote: > >>> From:

Re: [PATCH v2] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-06 Thread Omar Sandoval
On Mon, Feb 06, 2017 at 12:39:57PM -0700, Jens Axboe wrote: > On 02/06/2017 12:24 PM, Omar Sandoval wrote: > > From: Omar Sandoval <osan...@fb.com> > > > > In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() > > after we dispatch request

[PATCH v2] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-06 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() after we dispatch requests left over on our hardware queue dispatch list. This is so we'll go back and dispatch requests from the scheduler. In this case, it's only necessary to r

Re: [PATCH] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-03 Thread Omar Sandoval
On Fri, Feb 03, 2017 at 11:35:58AM -0800, Omar Sandoval wrote: > From: Omar Sandoval <osan...@fb.com> > > In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() > after we dispatch requests left over on our hardware queue dispatch > list. This is so we'll

[PATCH] blk-mq-sched: separate mark hctx and queue restart operations

2017-02-03 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() after we dispatch requests left over on our hardware queue dispatch list. This is so we'll go back and dispatch requests from the scheduler. In this case, it's only necessary to r

Re: [PATCH 2/2] block: free merged request in the caller

2017-02-03 Thread Omar Sandoval
d > request. Then we can do it outside of the lock, making it both more > efficient and fixing the blk-mq-sched problem of invoking parts of > the scheduler with an unknown lock state. > > Reported-by: Paolo Valente <paolo.vale...@linaro.org> Reviewed-by: Omar Sandoval <osan...

Re: [PATCH 1/2] blk-merge: return the merged request

2017-02-03 Thread Omar Sandoval
of the merge logic, so that callers can drop > locks before freeing the request. > > There should be no functional changes in this patch. Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-off-by: Jens Axboe <ax...@fb.com> > --- > block/blk-merge.c | 31 +++

[PATCH] blk-mq-sched: bypass the scheduler for flushes entirely

2017-02-02 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> There's a weird inconsistency that flushes are mostly hidden from the scheduler, but it needs to be aware of them in ->insert_requests(). Instead of having every scheduler call blk_mq_sched_bypass_insert(), let's do it in the common framework. S

Re: [PATCH 0/6] block: fix blk-mq debugfs vs. blktrace

2017-02-02 Thread Omar Sandoval
On Thu, Feb 02, 2017 at 11:58:53AM +0100, Greg Kroah-Hartman wrote: > On Wed, Feb 01, 2017 at 12:31:15AM -0800, Omar Sandoval wrote: > > On Wed, Feb 01, 2017 at 09:16:08AM +0100, Greg Kroah-Hartman wrote: > > > On Tue, Jan 31, 2017 at 02:53:16PM -0800, Omar Sandoval wrote:

Re: [PATCH v2] scsi, block: fix duplicate bdi name registration crashes

2017-02-01 Thread Omar Sandoval
add_disk(), and unregister the bdi, blk_cleanup_queue(). > > Thanks to Omar for the quick reproducer script [2]. This patch survives > where an unmodified kernel fails in a few seconds. > > [1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4 > [2]: http://marc.info/?l=linux-block

Re: [PATCH 4/4] blk-mq-debug: Introduce debugfs_create_files()

2017-02-01 Thread Omar Sandoval
to be passed to debugfs_create_files(). Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com> > Cc: Omar Sandoval <osan...@fb.com> > --- > block/blk-mq-debugfs.c | 34 +- > 1 file changed, 17 ins

Re: [PATCH 3/4] blk-mq-debug: Make show() operations interruptible

2017-02-01 Thread Omar Sandoval
On Wed, Feb 01, 2017 at 10:20:58AM -0800, Bart Van Assche wrote: > Allow users to interrupt show operations instead of making a user > space process unkillable if ownership of q->sysfs_lock cannot be > obtained. Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-of

Re: [PATCH 2/4] blk-mq-debug: Avoid that sparse complains about req_flags_t usage

2017-02-01 Thread Omar Sandoval
_sorted > block/elevator.c:541:29:got restricted req_flags_t > > block/blk-mq-debugfs.c:92:54: warning: cast from restricted req_flags_t Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com> > Cc: Omar Sando

Re: [PATCH 0/6] block: fix blk-mq debugfs vs. blktrace

2017-02-01 Thread Omar Sandoval
On Wed, Feb 01, 2017 at 09:16:08AM +0100, Greg Kroah-Hartman wrote: > On Tue, Jan 31, 2017 at 02:53:16PM -0800, Omar Sandoval wrote: > > From: Omar Sandoval <osan...@fb.com> > > > > When I moved the blk-mq debugging information to debugfs, I didn't > > r

[PATCH 5/6] blk-mq: move debugfs_remove() of disk dir to blk_release_queue()

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This needs to happen after we tear down blktrace. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-sysfs.c | 2 +- block/blk-sysfs.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/block/blk-mq-sysfs.c b/

[PATCH 3/6] blktrace: make do_blk_trace_setup() static

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This isn't used outside of blktrace.c anymore. Fixes: 62c2a7d969f3 ("block: push BKL into blktrace ioctls") Signed-off-by: Omar Sandoval <osan...@fb.com> --- include/linux/blktrace_api.h | 4 kernel/trace/blktrace.c | 6 +++-

[PATCH 6/6] blktrace: use existing disk debugfs directory

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> We may already have a directory to put the blktrace stuff in if 1. The disk uses blk-mq 2. CONFIG_BLK_DEBUG_FS is enabled 3. We are tracing the whole disk and not a partition Instead of hardcoding this very specific case, let's use the new debugfs_

[PATCH 4/6] block: use same block debugfs directory for blk-mq and blktrace

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> When I added the blk-mq debugging information to debugfs, I didn't notice that blktrace also creates a "block" directory in debugfs. Make them use the same dentry, now created in the core block code. Based on a patch from Jens. Signed-off-b

[PATCH 0/6] block: fix blk-mq debugfs vs. blktrace

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> When I moved the blk-mq debugging information to debugfs, I didn't realize that blktrace also created directories in debugfs that conflicted with the blk-mq directories. This series fixes that. Patch 1 adds a new debugfs helper needed for patch 6. Greg,

[PATCH 1/6] debugfs: add debugfs_lookup()

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> We don't always have easy access to the dentry of a file or directory we created in debugfs. Add a helper which allows us to get a dentry we previously created. The motivation for this change is a problem with blktrace and the blk-mq debugfs e

[PATCH 2/6] block: fix debugfs config conditional in struct request_queue

2017-01-31 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> The debugfs dentries are only used for CONFIG_BLK_DEBUG_FS, so make them conditional on that instead of CONFIG_DEBUG_FS. Signed-off-by: Omar Sandoval <osan...@fb.com> --- include/linux/blkdev.h | 2 +- 1 file changed, 1 insertion(+), 1 delet

Re: [RFC PATCH] scsi, block: fix duplicate bdi name registration crashes

2017-01-29 Thread Omar Sandoval
.bottom...@hansenpartnership.com> > > Cc: Bart Van Assche <bart.vanass...@sandisk.com> > > Cc: "Martin K. Petersen" <martin.peter...@oracle.com> > > Cc: Christoph Hellwig <h...@lst.de> > > Cc: Jens Axboe <ax...@kernel.dk> > > Report

[PATCH v2] blk-mq: fix debugfs compilation issues

2017-01-27 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This fixes a couple of problems: 1. In the !CONFIG_DEBUG_FS case, the stub definitions were bogus. 2. In the !CONFIG_BLOCK case, blk-mq-debugfs.c shouldn't be compiled at all. Fix the stub definitions and add a CONFIG_BLK_DEBUG_FS Kconfig option.

[PATCH] blk-mq: fix compilation in !CONFIG_DEBUG_FS case

2017-01-27 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> My bad for not testing it. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq.h | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/block/blk-mq.h b/block/blk-mq.h index 57cdbf6c0cee..f7b41dc5eb60 100644 ---

Re: [PATCH 5/5] blk-mq-sched: change ->dispatch_requests() to ->dispatch_request()

2017-01-26 Thread Omar Sandoval
On Thu, Jan 26, 2017 at 01:59:23PM -0700, Jens Axboe wrote: > On 01/26/2017 01:54 PM, Omar Sandoval wrote: > > On Thu, Jan 26, 2017 at 12:48:18PM -0700, Jens Axboe wrote: > >> When we invoke dispatch_requests(), the scheduler empties everything > >> into the passed

Re: [PATCH 5/5] blk-mq-sched: change ->dispatch_requests() to ->dispatch_request()

2017-01-26 Thread Omar Sandoval
On Thu, Jan 26, 2017 at 12:48:18PM -0700, Jens Axboe wrote: > When we invoke dispatch_requests(), the scheduler empties everything > into the passed in list. This isn't always a good thing, since it > means that we remove items that we could have potentially merged > with. > > Change the function

Re: [PATCH 4/5] blk-mq-sched: fix starvation for multiple hardware queues and shared tags

2017-01-26 Thread Omar Sandoval
have any IO pending on a hardware queue, yet we fail > getting a tag to start new IO. If that happens, it's not enough to > mark the hardware queue as needing a restart, we need to bubble > that up to the higher level queue as well. One minor nit below. Otherwise, makes sense. Reviewed-by

Re: [PATCH 1/5] blk-mq: improve scheduler queue sync/async running

2017-01-26 Thread Omar Sandoval
On Thu, Jan 26, 2017 at 12:48:14PM -0700, Jens Axboe wrote: > We'll use the same criteria for whether we need to run the queue sync > or async when we have a scheduler, as we do without one. Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-off-by: Jens Axboe <ax...@fb.com>

Re: [PATCH 2/5] blk-mq: fix potential race in queue restart and driver tag allocation

2017-01-26 Thread Omar Sandoval
ue where the needed > IO completes _after_ blk_mq_get_driver_tag() fails, but before we > manage to set the restart bit. Reviewed-by: Omar Sandoval <osan...@fb.com> > Signed-off-by: Jens Axboe <ax...@fb.com> > --- > block/blk-mq.c | 10 +- > 1 file changed, 9 inse

[PATCH v3 05/10] sbitmap: add helpers for dumping to a seq_file

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is useful debugging information that will be used in the blk-mq debugfs directory. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- Jens took offense to me making the bitmap dumps binary,

[PATCH v2 04/10] blk-mq: add extra request information to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> The request pointers by themselves aren't super useful. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --

[PATCH v2 07/10] blk-mq: move tags and sched_tags info from sysfs to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These are very tied to the blk-mq tag implementation, so exposing them to sysfs isn't a great idea. Move the debugging information to debugfs and add basic entries for the number of tags and the number of reserved tags to sysfs. Reviewed-by: Hannes Re

[PATCH v2 06/10] blk-mq: export software queue pending map to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is useful for debugging problems where we've gotten stuck with requests in the software queues. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 21 +++

[PATCH v2 10/10] blk-mq: move hctx and ctx counters from sysfs to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These counters aren't as out-of-place in sysfs as the other stuff, but debugfs is a slightly better home for them. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-

[PATCH v2 09/10] blk-mq: move hctx io_poll, stats, and dispatched from sysfs to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These statistics _might_ be useful to userspace, but it's better not to commit to an ABI for these yet. Also, the dispatched file in sysfs couldn't be cleared, so make it clearable like the others in debugfs. Reviewed-by: Hannes Reinecke <h...

[PATCH v2 02/10] blk-mq: add hctx->{state,flags} to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> hctx->state could come in handy for bugs where the hardware queue gets stuck in the stopped state, and hctx->flags is just useful to know. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk

[PATCH v2 01/10] blk-mq: create debugfs directory tree

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In preparation for putting blk-mq debugging information in debugfs, create a directory tree mirroring the one in sysfs: # tree -d /sys/kernel/debug/block /sys/kernel/debug/block |-- nvme0n1 | `-- mq | |-- 0 | | `-

[PATCH v2 05/10] sbitmap: add helpers for dumping to a seq_file

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is useful debugging information that will be used in the blk-mq debugfs directory. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- include/linux/sbitmap.h | 28

[PATCH v2 08/10] blk-mq: add tags and sched_tags bitmaps to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These can be used to debug issues like tag leaks and stuck requests. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 50

[PATCH v2 00/10] blk-mq: move debugging information from sysfs to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Changes from v1: - Make the sbitmap seq_file helpers take a (struct sbitmap *) instead of a (void *), since it's not possible to use them directly as the seq_file show helper, anyways - Fix a crash when reading ctx_map because it was attempting

[PATCH v2 03/10] blk-mq: move hctx->dispatch and ctx->rq_list from sysfs to debugfs

2017-01-25 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These lists are only useful for debugging; they definitely don't belong in sysfs. Putting them in debugfs also removes the limitation of a single page of output. Reviewed-by: Hannes Reinecke <h...@suse.com> Signed-off-by: Omar Sandoval <

Re: [PATCH 02/10] blk-mq: add hctx->{state,flags} to debugfs

2017-01-25 Thread Omar Sandoval
On Tue, Jan 24, 2017 at 02:25:39PM +0100, Hannes Reinecke wrote: > On 01/23/2017 07:59 PM, Omar Sandoval wrote: > > From: Omar Sandoval <osan...@fb.com> > > > > hctx->state could come in handy for bugs where the hardware queue gets > > stuck in the stopped s

[PATCH] blk-mq-sched: fix possible crash if changing scheduler fails

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In elevator_switch(), we free the old scheduler's tags and then initialize the new scheduler. If initializing the new scheduler fails, we fall back to the old scheduler, but our tags have already been freed. There's no reason to free the sched_tag

[PATCH 02/10] blk-mq: add hctx->{state,flags} to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> hctx->state could come in handy for bugs where the hardware queue gets stuck in the stopped state, and hctx->flags is just useful to know. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk

[PATCH 07/10] blk-mq: move tags and sched_tags info from sysfs to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These are very tied to the blk-mq tag implementation, so exposing them to sysfs isn't a great idea. Move the debugging information to debugfs and add basic entries for the number of tags and the number of reserved tags to sysfs. Signed-off-by: Omar Sa

[PATCH 09/10] blk-mq: move hctx io_poll, stats, and dispatched from sysfs to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These statistics _might_ be useful to userspace, but it's better not to commit to an ABI for these yet. Also, the dispatched file in sysfs couldn't be cleared, so make it clearable like the others in debugfs. Signed-off-by: Omar Sandoval <osan.

[PATCH 10/10] blk-mq: move hctx and ctx counters from sysfs to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These counters aren't as out-of-place in sysfs as the other stuff, but debugfs is a slightly better home for them. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 181 +

[PATCH 04/10] blk-mq: add extra request information to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> The request pointers by themselves aren't super useful. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-deb

[PATCH 06/10] blk-mq: export software queue pending map to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is useful for debugging problems where we've gotten stuck with requests in the software queues. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/

[PATCH 08/10] blk-mq: add tags and sched_tags bitmaps to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These can be used to debug issues like tag leaks and stuck requests. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-debugfs.c | 50 ++ 1 file changed, 50 insertions(+) diff --git a/

[PATCH 05/10] sbitmap: add helpers for dumping to a seq_file

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is useful debugging information that will be used in the blk-mq debugfs directory. Signed-off-by: Omar Sandoval <osan...@fb.com> --- include/linux/sbitmap.h | 34 lib/sbitmap.c

[PATCH 03/10] blk-mq: move hctx->dispatch and ctx->rq_list from sysfs to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> These lists are only useful for debugging; they definitely don't belong in sysfs. Putting them in debugfs also removes the limitation of a single page of output. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-deb

[PATCH 00/10] blk-mq: move debugging information from sysfs to debugfs

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This series ends our abuse of sysfs and puts all of the debugging information in debugfs instead. This has a few benefits: 1. Removes the possibility of userspace being stupid and relying on something in sysfs that we only exposed for debugging. 2.

[PATCH 01/10] blk-mq: create debugfs directory tree

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In preparation for putting blk-mq debugging information in debugfs, create a directory tree mirroring the one in sysfs: # tree -d /sys/kernel/debug/block /sys/kernel/debug/block |-- nvme0n1 | `-- mq | |-- 0 | | `-

[PATCH 1/2] sbitmap: use smp_mb__after_atomic() in sbq_wake_up()

2017-01-18 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> We always do an atomic clear_bit() right before we call sbq_wake_up(), so we can use smp_mb__after_atomic(). While we're here, comment the memory barriers in here a little more. Signed-off-by: Omar Sandoval <osan...@fb.com> --- lib/sb

[PATCH 2/2] sbitmap: fix wakeup hang after sbq resize

2017-01-18 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> When we resize a struct sbitmap_queue, we update the wakeup batch size, but we don't update the wait count in the struct sbq_wait_states. If we resized down from a size which could use a bigger batch size, these counts could be too large and cause us t

Re: [PATCH 08/10] blk-mq-sched: add framework for MQ capable IO schedulers

2017-01-13 Thread Omar Sandoval
On Fri, Jan 13, 2017 at 12:15:17PM +0100, Hannes Reinecke wrote: > On 01/11/2017 10:40 PM, Jens Axboe wrote: > > This adds a set of hooks that intercepts the blk-mq path of > > allocating/inserting/issuing/completing requests, allowing > > us to develop a scheduler within that framework. > > > >

Re: [patch] nbd: blk_mq_init_queue returns an error code on failure, not NULL

2017-01-09 Thread Omar Sandoval
ue in > virtio_blk. Compile-tested only. > > Signed-off-by: Jeff Moyer <jmo...@redhat.com> Reviewed-by: Omar Sandoval <osan...@fb.com> Compile-reviewed only :) Josef can probably test it if he cares enough, but it looks right. > diff --git a/drivers/block/nbd.c b/drivers/b

Re: [PATCHSET v4] blk-mq-scheduling framework

2016-12-22 Thread Omar Sandoval
On Thu, Dec 22, 2016 at 04:57:36PM +, Bart Van Assche wrote: > On Thu, 2016-12-22 at 08:52 -0800, Omar Sandoval wrote: > > This approach occurred to us, but we couldn't figure out a way to make > > blk_mq_tag_to_rq() work with it. From skimming over the patches, I > >

Re: [PATCHSET v4] blk-mq-scheduling framework

2016-12-22 Thread Omar Sandoval
On Thu, Dec 22, 2016 at 04:23:24PM +, Bart Van Assche wrote: > On Fri, 2016-12-16 at 17:12 -0700, Jens Axboe wrote: > > From the discussion last time, I looked into the feasibility of having > > two sets of tags for the same request pool, to avoid having to copy > > some of the request fields

[PATCH 1/2] nvme: untangle 0 and BLK_MQ_RQ_QUEUE_OK

2016-11-15 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having specific values. No functional change. Signed-off-by: Omar Sandoval <osan...@fb.com> --- drivers/nvme/host/core.c | 4 ++-- drivers/nvme/host/pci.c | 8 drivers/nvme

[PATCH 2/2] scsi_lib: untangle 0 and BLK_MQ_RQ_QUEUE_OK

2016-11-15 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having specific values. No functional change. Signed-off-by: Omar Sandoval <osan...@fb.com> --- drivers/scsi/scsi_lib.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)

[PATCH] loop: return proper error from loop_queue_rq()

2016-11-14 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> ->queue_rq() should return one of the BLK_MQ_RQ_QUEUE_* constants, not an errno. Fixes: f4aa4c7bbac6 ("block: loop: convert to per-device workqueue") Signed-off-by: Omar Sandoval <osan...@fb.com> --- drivers/block/loop.c | 2 +- 1 file c

Re: [PATCH 3/3] blk-mq: make the polling code adaptive

2016-11-14 Thread Omar Sandoval
On Fri, Nov 11, 2016 at 10:11:27PM -0700, Jens Axboe wrote: > The previous commit introduced the hybrid sleep/poll mode. Take > that one step further, and use the completion latencies to > automatically sleep for half the mean completion time. This is > a good approximation. > > This changes the

Re: [PATCH 1/3] block: fast-path for small and simple direct I/O requests

2016-11-14 Thread Omar Sandoval
On Fri, Nov 11, 2016 at 10:11:25PM -0700, Jens Axboe wrote: > From: Christoph Hellwig > > This patch adds a small and simple fast path for small direct I/O > requests on block devices that don't use AIO. Between the neat > bio_iov_iter_get_pages helper that avoids allocating a

Re: [PATCH 2/2] block: fast-path for small and simple direct I/O requests

2016-10-31 Thread Omar Sandoval
On Mon, Oct 31, 2016 at 11:59:25AM -0600, Christoph Hellwig wrote: > This patch adds a small and simple fast path for small direct I/O > requests on block devices that don't use AIO. Between the neat > bio_iov_iter_get_pages helper that avoids allocating a page array > for get_user_pages and the

Re: Device or HBA level QD throttling creates randomness in sequetial workload

2016-10-26 Thread Omar Sandoval
On Tue, Oct 25, 2016 at 12:24:24AM +0530, Kashyap Desai wrote: > > -Original Message- > > From: Omar Sandoval [mailto:osan...@osandov.com] > > Sent: Monday, October 24, 2016 9:11 PM > > To: Kashyap Desai > > Cc: linux-s...@vger.kernel.org; linux-ker...@

Re: Device or HBA level QD throttling creates randomness in sequetial workload

2016-10-21 Thread Omar Sandoval
On Fri, Oct 21, 2016 at 05:43:35PM +0530, Kashyap Desai wrote: > Hi - > > I found below conversation and it is on the same line as I wanted some > input from mailing list. > > http://marc.info/?l=linux-kernel&m=147569860526197&w=2 > > I can do testing on any WIP item as Omar mentioned in above

Re: [RFD] I/O scheduling in blk-mq

2016-10-05 Thread Omar Sandoval
Hey, Paolo, On Wed, Aug 31, 2016 at 05:20:10PM +0200, Paolo Valente wrote: [snip] > > Hi, Paolo, > > > > I've been working on I/O scheduling for blk-mq with Jens for the past > > few months (splitting time with other small projects), and we're making > > good progress. Like you noticed, the hard

Re: [PATCH 0/2]: Add option for async ->queue_rq

2016-09-28 Thread Omar Sandoval
On Tue, Sep 27, 2016 at 05:25:36PM -0700, Bart Van Assche wrote: > On 09/22/16 07:52, Jens Axboe wrote: > > Two patches that add the ability for a driver to flag itself > > as wanting the ->queue_rq() invoked in a manner that allows > > it to block. We'll need that for the nbd conversion, to avoid

Re: [PATCH 04/14] blk-mq: Do not limit number of queues to 'nr_cpu_ids' in allocations

2016-09-20 Thread Omar Sandoval
On Tue, Sep 20, 2016 at 01:44:36PM +0200, Alexander Gordeev wrote: > On Mon, Sep 19, 2016 at 10:48:49AM -0700, Omar Sandoval wrote: > > On Sun, Sep 18, 2016 at 09:37:14AM +0200, Alexander Gordeev wrote: > > > Currently maximum number of used hardware queues is limited to &g

Re: [PATCH 03/14] block: Get rid of unused request_queue::nr_queues member

2016-09-19 Thread Omar Sandoval
On Sun, Sep 18, 2016 at 09:37:13AM +0200, Alexander Gordeev wrote: > CC: linux-block@vger.kernel.org > Signed-off-by: Alexander Gordeev <agord...@redhat.com> Reviewed-by: Omar Sandoval <osan...@fb.com> > --- > block/blk-mq.c | 2 -- > include/linux/blkdev.h

Re: [PATCH 02/14] blk-mq: Fix a potential NULL pointer assignment to hctx tags

2016-09-19 Thread Omar Sandoval
On Sun, Sep 18, 2016 at 09:37:12AM +0200, Alexander Gordeev wrote: > If number of used hardware queues is dynamically decreased > then tags corresponding to the newly unused queues are freed. > > If previously unused hardware queues are then reused again > they will start referring the previously

Re: [PATCH 09/14] blk-mq: Move duplicating code to blk_mq_exit_hctx()

2016-09-19 Thread Omar Sandoval
On Sun, Sep 18, 2016 at 09:37:19AM +0200, Alexander Gordeev wrote: > CC: linux-block@vger.kernel.org > Signed-off-by: Alexander Gordeev > --- > block/blk-mq.c | 14 +- > 1 file changed, 5 insertions(+), 9 deletions(-) > > diff --git a/block/blk-mq.c

Re: [PATCH 10/14] blk-mq: Uninit hardware context in order reverse to init

2016-09-19 Thread Omar Sandoval
On Sun, Sep 18, 2016 at 09:37:20AM +0200, Alexander Gordeev wrote: > CC: linux-block@vger.kernel.org > Signed-off-by: Alexander Gordeev > --- > block/blk-mq.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index

Re: [PATCH 04/14] blk-mq: Do not limit number of queues to 'nr_cpu_ids' in allocations

2016-09-19 Thread Omar Sandoval
On Sun, Sep 18, 2016 at 09:37:14AM +0200, Alexander Gordeev wrote: > Currently maximum number of used hardware queues is limited to > number of CPUs in the system. However, using 'nr_cpu_ids' as > the limit for (de-)allocations of data structures instead of > existing data structures' counters (a)

Re: [PATCH] sbitmap: avoid maybe-uninitialized warning

2016-09-19 Thread Omar Sandoval
On Mon, Sep 19, 2016 at 09:22:34AM -0600, Jens Axboe wrote: > On 09/19/2016 09:14 AM, Arnd Bergmann wrote: > > On Monday, September 19, 2016 8:43:12 AM CEST Jens Axboe wrote: > > > On 09/19/2016 06:33 AM, Arnd Bergmann wrote: > > > > The sbitmap code that has just been turned into a library module

[PATCH v3 4/5] sbitmap: push alloc policy into sbitmap_queue

2016-09-09 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> Again, there's no point in passing this in every time. Make it part of struct sbitmap_queue and clean up the API. Signed-off-by: Omar Sandoval <osan...@fb.com> --- block/blk-mq-tag.c | 33 +++-- block/blk-mq-tag

[PATCH v3 1/5] blk-mq: abstract tag allocation out into sbitmap library

2016-09-09 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is a generally useful data structure, so make it available to anyone else who might want to use it. It's also a nice cleanup separating the allocation logic from the rest of the tag handling logic. The code is behind a new Kconfig option, CONFIG_S

[PATCH v3 0/5] blk-mq: abstract tag allocation out into sbitmap library

2016-09-09 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> This is v3 of the scalable bitmap library derived from blk-mq's tag allocation code. v1 is here [1], v2 is here [2]. Changes in v3: - Renamed scale_bitmap to sbitmap Changes in v2: - Return -EINVAL instead of BUG_ON() if an invalid shift is

[PATCH v3 5/5] sbitmap: randomize initial last_cache values

2016-09-09 Thread Omar Sandoval
From: Omar Sandoval <osan...@fb.com> In order to get good cache behavior from a sbitmap, we want each CPU to stick to its own cacheline(s) as much as possible. This might happen naturally as the bitmap gets filled up and the last_cache values spread out, but we really want this behavio
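The idea in this commit message, starting each CPU at a different position in the bitmap so that CPUs tend to stay on their own cachelines, might look something like the sketch below. It is not the sbitmap implementation: the `demo_*` helpers, the plain array of per-CPU hints, and the small linear congruential generator are all assumptions made so the example is self-contained and deterministic.

```c
#include <stddef.h>
#include <stdint.h>

/* Tiny deterministic pseudo-random generator, for illustration only. */
static uint32_t demo_next_random(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return *state >> 16; /* use the better-mixed high bits */
}

/*
 * Give every CPU a pseudo-random starting offset into a bitmap of
 * `depth` bits, rather than starting all of them at bit 0. A depth of
 * 0 leaves every hint at 0 to avoid a division by zero.
 */
static void demo_init_alloc_hints(unsigned int *hints, size_t nr_cpus,
				  unsigned int depth, uint32_t seed)
{
	uint32_t state = seed;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++)
		hints[cpu] = depth ? demo_next_random(&state) % depth : 0;
}
```

Each CPU would then begin its search for a free bit at its own hint, so early allocations from different CPUs are likely to touch different words of the bitmap.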
