[PATCH 00/11] Make all concurrent queue flag manipulations safe

2018-02-28 Thread Bart Van Assche
for kernel v4.17. Note: it may be a good idea to postpone patch 11 until after the kernel v4.17 merge window to avoid merge conflicts. Thanks, Bart. Bart Van Assche (11): block: Reorder the queue flag manipulation function definitions block: Use the queue_flag_*() functions instead of open-coding

[PATCH 04/11] block: Protect queue flag changes with the queue lock

2018-02-28 Thread Bart Van Assche
Since the queue flags may be changed concurrently from multiple contexts after a queue becomes visible in sysfs, make these changes safe by protecting these with the queue lock. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes

[PATCH 01/11] block: Reorder the queue flag manipulation function definitions

2018-02-28 Thread Bart Van Assche
Move the definition of queue_flag_clear_unlocked() up and move the definition of queue_in_flight() down such that all queue flag manipulation function definitions become contiguous. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com>

[PATCH 02/11] block: Use the queue_flag_*() functions instead of open-coding these

2018-02-28 Thread Bart Van Assche
Except for changing the atomic queue flag manipulations that are protected by the queue lock into non-atomic manipulations, this patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke

[PATCH 03/11] block: Introduce blk_queue_flag_{set,clear,test_and_{set,clear}}()

2018-02-28 Thread Bart Van Assche
Introduce functions that modify the queue flags and that protect these modifications with the request queue lock. Except for moving one wake_up_all() call from inside to outside a critical section, this patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.van
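The idea described in this patch, helpers that take the lock themselves before touching the flags word, can be illustrated with a small user-space sketch. This is not the kernel code: `struct demo_queue`, the `demo_queue_flag_*()` names and the use of a pthread mutex in place of the request queue lock are all assumptions made for illustration only.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue: a flags word plus its lock. */
struct demo_queue {
	pthread_mutex_t lock;	/* plays the role of the queue lock */
	unsigned long flags;	/* one bit per queue flag */
};

#define DEMO_QUEUE_INIT { PTHREAD_MUTEX_INITIALIZER, 0 }

/* Set a flag while holding the lock. */
static void demo_queue_flag_set(unsigned int flag, struct demo_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->flags |= 1UL << flag;
	pthread_mutex_unlock(&q->lock);
}

/* Clear a flag while holding the lock. */
static void demo_queue_flag_clear(unsigned int flag, struct demo_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->flags &= ~(1UL << flag);
	pthread_mutex_unlock(&q->lock);
}

/* Set a flag and report whether it was already set. */
static bool demo_queue_flag_test_and_set(unsigned int flag, struct demo_queue *q)
{
	bool was_set;

	pthread_mutex_lock(&q->lock);
	was_set = (q->flags & (1UL << flag)) != 0;
	q->flags |= 1UL << flag;
	pthread_mutex_unlock(&q->lock);
	return was_set;
}

/* Clear a flag and report whether it was set before. */
static bool demo_queue_flag_test_and_clear(unsigned int flag, struct demo_queue *q)
{
	bool was_set;

	pthread_mutex_lock(&q->lock);
	was_set = (q->flags & (1UL << flag)) != 0;
	q->flags &= ~(1UL << flag);
	pthread_mutex_unlock(&q->lock);
	return was_set;
}
```

Because each helper acquires the lock internally, a caller cannot forget to take it, which is the property the series is after for flag changes made after a queue has become visible in sysfs.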

[PATCH 10/11] block: Complain if queue_flag_(set|clear)_unlocked() is abused

2018-02-28 Thread Bart Van Assche
Since it is not safe to use queue_flag_(set|clear)_unlocked() without holding the queue lock after the sysfs entries for a queue have been created, complain if this happens. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Mike Snitzer <snit...@redhat.com> Cc: Christop

[PATCH 09/11] block: Use blk_queue_flag_*() in drivers instead of queue_flag_*()

2018-02-28 Thread Bart Van Assche
this patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Martin K. Petersen <martin.peter...@oracle.com> Cc: Mike Snitzer <snit...@redhat.com> Cc: Shaohua Li <s...@fb.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Han

[PATCH 08/11] target/tcm_loop: Use blk_queue_flag_set()

2018-02-28 Thread Bart Van Assche
Use blk_queue_flag_set() instead of open-coding this function. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Nicholas A. Bellinger <n...@linux-iscsi.org> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke <h...@suse.de> Cc: Johannes Thumshirn <jthum

[PATCH 11/11] block: Move the queue_flag_*() functions from a public into a private header file

2018-02-28 Thread Bart Van Assche
This patch helps to avoid that new code gets introduced in block drivers that manipulates queue flags without holding the queue lock when that lock should be held. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke <h

[PATCH 06/11] bcache: Use the blk_queue_flag_{set,clear}() functions

2018-02-28 Thread Bart Van Assche
Use the blk_queue_flag_{set,clear}() functions instead of open-coding these. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Michael Lyle <ml...@lyle.org> Cc: Kent Overstreet <kent.overstr...@gmail.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Re

[PATCH 07/11] iscsi: Use blk_queue_flag_set()

2018-02-28 Thread Bart Van Assche
Use blk_queue_flag_set() instead of open-coding this function. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Martin K. Petersen <martin.peter...@oracle.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke <h...@suse.de> Cc: Johannes Thumshirn <j

Re: [PATCH v5 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-28 Thread Bart Van Assche
On Wed, 2018-02-28 at 11:19 -0700, Jens Axboe wrote: > Didn't Ming ack the first three? Hello Jens, This morning I did what I usually do before I repost a patch series, namely to look at the replies to individual patches for reviewed-by tags. That's how I overlooked the following (see also

[PATCH v5 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-28 Thread Bart Van Assche
the blkcg code and queue cleanup. Changes between v1 and v2: - Split a single patch into two patches. - Dropped blk_alloc_queue_node2() and modified all block drivers that call blk_alloc_queue_node(). Bart Van Assche (6): block/loop: Delete gendisk before cleaning up the request queue md: Delete

[PATCH v5 4/6] block: Add 'lock' as third argument to blk_alloc_queue_node()

2018-02-28 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Hansson <ulf.hans..

[PATCH v5 5/6] block: Fix a race between the cgroup code and request queue initialization

2018-02-28 Thread Bart Van Assche
earlier if necessary. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc:

[PATCH v5 6/6] block: Fix a race between request queue removal and the block cgroup controller

2018-02-28 Thread Bart Van Assche
rence on a request queue after having called blk_cleanup_queue(). Neither driver accesses any of the removed data structures between its blk_cleanup_queue() and blk_put_queue() calls. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.

[PATCH v5 3/6] zram: Delete gendisk before cleaning up the request queue

2018-02-28 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com>

[PATCH v5 1/6] block/loop: Delete gendisk before cleaning up the request queue

2018-02-28 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com> Cc

[PATCH v5 2/6] md: Delete gendisk before cleaning up the request queue

2018-02-28 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com>

[PATCH] mq-deadline: Make sure to always unlock zones

2018-02-28 Thread Bart Van Assche
: Introduce zone locking support") Signed-off-by: Damien Le Moal <damien.lem...@wdc.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> [ bvanassche: edited patch description ] Cc: Hannes Reinecke <h...@suse.com> Cc: Ming Lei <ming@redhat.com> --- block/mq-

[PATCH 1/2] blk-mq-debugfs: Reorder queue show and store methods

2018-02-27 Thread Bart Van Assche
Make sure that the queue show and store methods are contiguous and also that these appear in alphabetical order. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Omar Sandoval <osan...@fb.com> Cc: Damien Le Moal <damien.lem...@wdc.com> Cc: Ming Lei <ming@r

[PATCH 2/2] blk-mq-debugfs: Show zone locking information

2018-02-27 Thread Bart Van Assche
When debugging the ZBC code in the mq-deadline scheduler it is very important to know which zones are locked and which zones are not locked. Hence this patch that exports the zone locking information through debugfs. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Omar Sandoval

[PATCH 0/2] Make the zone locking information available in debugfs

2018-02-27 Thread Bart Van Assche
this patch series. Please consider this patch series for kernel v4.17. Thanks, Bart. Bart Van Assche (2): blk-mq-debugfs: Reorder queue show and store methods blk-mq-debugfs: Show zone locking information block/blk-mq-debugfs.c | 134 +++-- 1 file

[PATCH] blk-mq: Make sure that the affected zone is unlocked if a request times out

2018-02-27 Thread Bart Van Assche
applies to remains locked forever and no further writes are accepted for that zone. Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support") Signed-off-by: Damien Le Moal <damien.lem...@wdc.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Hannes Reine

Re: [PATCH v4 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-27 Thread Bart Van Assche
On Sat, 2018-02-24 at 20:44 +0800, Ming Lei wrote: > On Thu, Feb 22, 2018 at 05:08:02PM -0800, Bart Van Assche wrote: > > Hello Jens, > > > > Recently Joseph Qi identified races between the block cgroup code and > > request > > queue initialization and c

Re: [PATCH v4 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-27 Thread Bart Van Assche
On Thu, 2018-02-22 at 17:08 -0800, Bart Van Assche wrote: > Recently Joseph Qi identified races between the block cgroup code and request > queue initialization and cleanup. This patch series address these races. > Please > consider these patches for kernel v4.17. Hello Jens,

Re: [PATCH] null_blk: add 'requeue' fault attribute

2018-02-27 Thread Bart Van Assche
the testing arsenal to ensure that we are handling > requeue conditions correctly. > > This works for queue mode 1 (legacy request_fn based path) and 2 (blk-mq > path), as there's no good way to do requeue with a bio based driver. > This is similar to the timeout path. Reviewed-by: B

Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote: > On 2018/2/26 23:25, Bart Van Assche wrote: > > On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote: > > > When i have a test on kernel 4.16-rc1, find a issue: running IO on SATA > > > disk, then disable t

Re: [PATCH 1/4] block: fix the count of PGPGOUT for WRITE_SAME

2018-02-26 Thread Bart Van Assche
On Mon, 2018-02-26 at 20:04 +0800, Jiufei Xue wrote: > The vm counters is counted in sectors, so we should do the conversation > in submit_bio. > > Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and > partitions index") > > Signed-off-by: Jiufei Xue

Re: [PATCH 1/2] blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch

2018-02-23 Thread Bart Van Assche
Anyway, if you add the following to this patch: Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers") Cc: sta...@vger.kernel.org then you can also add: Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>

Re: [PATCH V2] block: pass inclusive 'lend' parameter to truncate_inode_pages_range

2018-02-23 Thread Bart Van Assche
On Sat, 2018-02-10 at 08:46 +0800, Ming Lei wrote: > The 'lend' parameter of truncate_inode_pages_range is required to be > inclusive, so follow the rule. > > This patch fixes one memory corruption triggered by discard. Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
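The quoted description hinges on the difference between an exclusive end offset (`start + len`) and an inclusive one (`start + len - 1`). A minimal sketch of that arithmetic follows; `demo_inclusive_end()` is a hypothetical helper and not part of the patch, and the sketch assumes a nonzero length.

```c
#include <stdint.h>

/*
 * Hypothetical helper: offset of the last byte covered by a range of
 * 'len' bytes that starts at 'start'. 'len' must be nonzero.
 */
static inline uint64_t demo_inclusive_end(uint64_t start, uint64_t len)
{
	return start + len - 1;
}
```

Passing `start + len` where an inclusive end is expected names one byte past the intended range, which is the kind of off-by-one the quoted patch description refers to.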

Re: v4.16-rc2: I/O hang with dm-rq + Kyber

2018-02-23 Thread Bart Van Assche
On Sat, 2018-02-24 at 00:26 +0800, Ming Lei wrote: > The following 2 patch fixes one IO hang on kyber in my test on USB, could > you test it and see if your case can be fixed? > > https://marc.info/?l=linux-block&m=151940022831994&w=2 These two patches are sufficient to make my test pass.

Re: [PATCH 2/2] block: kyber: fix domain token leak during requeue

2018-02-23 Thread Bart Van Assche
c: stable" tag to this patch. Anyway: Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>

Re: [PATCH v4 4/6] block: Add a third argument to blk_alloc_queue_node()

2018-02-23 Thread Bart Van Assche
On Fri, 2018-02-23 at 09:26 +0100, Johannes Thumshirn wrote: > how about "block: Add 'lock' as third argument to > blk_alloc_queue_node()"? > > So one actually sees early enough what the thrird argument will be? Hello Johannes, If I have to repost this patch series I will make that change.

Re: disk-io lockup in 4.14.13 kernel

2018-02-23 Thread Bart Van Assche
On Fri, 2018-02-23 at 11:58 +0200, Jaco Kroon wrote: > On 22/02/2018 18:46, Bart Van Assche wrote: > > (cd /sys/kernel/debug/block && find . -type f -exec grep -aH . {} \;) > > I don't have a /sys/kernel/debug folder - I've enabled CONFIG_DEBUG_FS > and BLK_DEBUG_FS

[PATCH v4 1/6] block/loop: Delete gendisk before cleaning up the request queue

2018-02-22 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Josef Bacik <jba...@fb.com> Cc: Shaohua Li <s...@fb.com> Cc: Omar Sandoval <osan...@fb.com>

[PATCH v4 2/6] md: Delete gendisk before cleaning up the request queue

2018-02-22 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Shaohua Li <s...@kernel.org> --- drivers/md/md.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)

[PATCH v4 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-22 Thread Bart Van Assche
() and modified all block drivers that call blk_alloc_queue_node(). Bart Van Assche (6): block/loop: Delete gendisk before cleaning up the request queue md: Delete gendisk before cleaning up the request queue zram: Delete gendisk before cleaning up the request queue block: Add a third argument

[PATCH v4 6/6] block: Fix a race between request queue removal and the block cgroup controller

2018-02-22 Thread Bart Van Assche
rence on a request queue after having called blk_cleanup_queue(). Neither driver accesses any of the removed data structures between its blk_cleanup_queue() and blk_put_queue() calls. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com>

[PATCH v4 5/6] block: Fix a race between the cgroup code and request queue initialization

2018-02-22 Thread Bart Van Assche
earlier if necessary. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alibaba.com> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Ha

[PATCH v4 4/6] block: Add a third argument to blk_alloc_queue_node()

2018-02-22 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alibaba.com> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Hansson <ulf.hans...@linar

[PATCH v4 3/6] zram: Delete gendisk before cleaning up the request queue

2018-02-22 Thread Bart Van Assche
Remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Minchan Kim <minc...@kernel.org> Cc: Nitin Gupta <ngu...@vflare.org> Cc: Sergey Senozhatsky <

Re: v4.16-rc2: I/O hang with dm-rq + Kyber

2018-02-22 Thread Bart Van Assche
On Thu, 2018-02-22 at 14:42 -0800, Omar Sandoval wrote: > On Thu, Feb 22, 2018 at 09:10:23PM +0000, Bart Van Assche wrote: > > I/O hangs if I run the following command on top of kernel v4.16-rc2 + the > > ib_srpt patch that adds RDMA/CM support: > > > > srp-test/run_te

v4.16-rc2: I/O hang with dm-rq + Kyber

2018-02-22 Thread Bart Van Assche
Hello Omar, I/O hangs if I run the following command on top of kernel v4.16-rc2 + the ib_srpt patch that adds RDMA/CM support: srp-test/run_tests -c -d -r 10 -t 02-mq -e kyber This does not happen with the deadline scheduler nor without a scheduler. This test passed a few months ago. I have

Re: [PATCH v3 3/3] block: Fix a race between request queue removal and the block cgroup controller

2018-02-22 Thread Bart Van Assche
On Thu, 2018-02-22 at 10:25 +0800, Joseph Qi wrote: > I notice that several devices such as loop and zram will call > blk_cleanup_queue before del_gendisk, so it will hit this warning. Is > this normal? Hello Joseph, Since the disk object has a reference to the queue I agree with Ming that it's

Re: disk-io lockup in 4.14.13 kernel

2018-02-22 Thread Bart Van Assche
On 02/22/18 02:58, Jaco Kroon wrote: We've been seeing sporadic IO lockups on recent kernels. Are you using the legacy I/O stack or blk-mq? If you are not yet using blk-mq, can you switch to blk-mq + scsi-mq + dm-mq? If the lockup is reproducible with blk-mq, can you share the output of the

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-21 Thread Bart Van Assche
On Wed, 2018-02-21 at 11:21 -0800, t...@kernel.org wrote: > Hello, Bart. > > On Wed, Feb 21, 2018 at 06:53:05PM +0000, Bart Van Assche wrote: > > On Sun, 2018-02-18 at 05:11 -0800, t...@kernel.org wrote: > > > On Wed, Feb 14, 2018 at 04:58:56PM +000

Re: [PATCH v3 0/3] Fix races between blkcg code and request queue initialization and cleanup

2018-02-21 Thread Bart Van Assche
On 02/09/18 10:44, Bart Van Assche wrote: Recently Joseph Qi identified races between the block cgroup code and request queue initialization and cleanup. This patch series address these races. Please consider these patches for kernel v4.17. Hello Joseph, Can you add your Tested-by or Reviewed

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-21 Thread Bart Van Assche
On Sun, 2018-02-18 at 05:11 -0800, t...@kernel.org wrote: > On Wed, Feb 14, 2018 at 04:58:56PM +0000, Bart Van Assche wrote: > > With this patch applied the tests I ran so far pass. > > Ah, great to hear. Thanks a lot for testing. Can you please verify > the following? It's

Re: v4.16-rc1 + dm-mpath + BFQ

2018-02-21 Thread Bart Van Assche
On Fri, 2018-02-16 at 08:39 +0100, Paolo Valente wrote: > after enabling the listing options in your list, and a few other > related options, such iblock support, I get this: > > $ sudo ./run_tests -c -d -r 10 -t 02-mq -e bfq > Unloaded the ib_srpt kernel module > Unloaded the rdma_rxe kernel

Re: v4.16-rc1 + dm-mpath + BFQ

2018-02-14 Thread Bart Van Assche
On 02/14/18 09:55, Paolo Valente wrote: After following all of them (and taking some other step needed), I invoked: sudo ./run_tests -c -d -r 10 -t 02-mq -e bfq But I got the following: ./lib/functions: line 34: /sys/class/block/ram0/size: No such file or directory ./lib/functions: line 34: *

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-14 Thread Bart Van Assche
On Tue, 2018-02-13 at 13:20 -0800, t...@kernel.org wrote: > On Thu, Feb 08, 2018 at 04:31:43PM +0000, Bart Van Assche wrote: > > The crash is reported at address scsi_times_out+0x17 == scsi_times_out+23. > > The > > instruction at that address tries to dereference s

Re: v4.16-rc1 + dm-mpath + BFQ

2018-02-13 Thread Bart Van Assche
On Tue, 2018-02-13 at 19:38 +0100, Paolo Valente wrote: > as a first attempt, I've followed your steps, but got: > Error: could not find sg_reset Please install the sg3_utils package. Every Linux distro I know of supports that package. And in case you would like to install it from source, the

Re: [PATCH] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into

2018-02-13 Thread Bart Van Assche
On Tue, 2018-02-13 at 17:54 +0900, Sergey Senozhatsky wrote: > On (02/12/18 11:05), Bart Van Assche wrote: > [..] > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > > index ac4740cf74be..cf17626604c2 100644 > > --- a/include/linux/blkdev.h > >

Re: [PATCH] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into

2018-02-13 Thread Bart Van Assche
On Tue, 2018-02-13 at 09:43 +0100, Johannes Thumshirn wrote: > On Mon, 2018-02-12 at 11:05 -0800, Bart Van Assche wrote: > > +/* > > + * Variables of type sector_t represent an offset or size that is a > > multiple of > > + * 2**9 bytes. Hence these two const

[PATCH] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into

2018-02-12 Thread Bart Van Assche
/ SECTOR_SHIFT / SECTOR_BITS definitions have not been removed from uapi header files nor from NAND drivers in which these constants are used for another purpose than converting block layer offsets and sizes into a number of sectors. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc:
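A comment quoted in a reply to this patch states that variables of type sector_t represent an offset or size that is a multiple of 2**9 bytes. The conversion this implies can be sketched as follows; the `DEMO_` constants and helper names are hypothetical and only mirror the 2**9 relationship, they are not the definitions from the patch.

```c
#include <stdint.h>

/* Hypothetical constants mirroring the 2**9 bytes-per-sector relationship. */
#define DEMO_SECTOR_SHIFT 9
#define DEMO_SECTOR_SIZE  (1U << DEMO_SECTOR_SHIFT)

/* Convert a byte count to whole sectors (any partial sector is dropped). */
static inline uint64_t demo_bytes_to_sectors(uint64_t bytes)
{
	return bytes >> DEMO_SECTOR_SHIFT;
}

/* Convert a sector count to bytes. */
static inline uint64_t demo_sectors_to_bytes(uint64_t sectors)
{
	return sectors << DEMO_SECTOR_SHIFT;
}
```

Keeping the shift and the size in one place avoids each driver carrying its own copy of the same two constants.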

[PATCH] block: Reorder the queue flag manipulation function definitions

2018-02-12 Thread Bart Van Assche
Move the definition of queue_flag_clear_unlocked() up and move the definition of queue_in_flight() down such that all queue flag manipulation function definitions become contiguous. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> --- i

[PATCH] blk-mq-debugfs: Also show requests that have not yet been started

2018-02-12 Thread Bart Van Assche
ow_rq() NULL pointer dereference"). Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Ming Lei <ming@redhat.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Hannes Reinecke <h...@suse.com> Cc: Johannes Thumshirn <jthumsh...@suse.de> Cc: Martin

Re: vgdisplay hang on iSCSI session

2018-02-12 Thread Bart Van Assche
On 02/05/18 08:01, Jean-Louis Dupond wrote: We've got some "strange" issue on a Xen hypervisor with CentOS 6 and 4.9.63-29.el6.x86_6 kernel. Hello Jean-Louis, Since this behavior was observed with a distro kernel I think a support request should be submitted to the vendor of that kernel.

Re: v4.16-rc1 + dm-mpath + BFQ

2018-02-12 Thread Bart Van Assche
On 02/11/18 23:35, Paolo Valente wrote: Also this smells a little bit like some spurious elevator call. Unfortunately I have no clue on the cause. To go on, I need at least to reproduce it. In this respect: Bart, could you please tell me how to setup the offending configuration, and to cause

Re: [PATCH] blk: optimization for classic polling

2018-02-12 Thread Bart Van Assche
On 05/29/83 20:21, Nitesh Shetty wrote: [ ... ] Hello Nitesh, Can you check the clock of the system you used to send this e-mail? In the header of your e-mail I found the following: Date: Sun, 30 May 2083 09:51:06 +0530 Thanks, Bart.

Re: v4.16-rc1 + dm-mpath + BFQ

2018-02-09 Thread Bart Van Assche
On 02/09/18 10:58, Jens Axboe wrote: On 2/9/18 11:54 AM, Bart Van Assche wrote: Hello Paolo, If I enable the BFQ scheduler for a dm-mpath device then a kernel oops appears (see also below). This happens systematically with Linus' tree from this morning (commit 54ce685cae30) merged with Jens

v4.16-rc1 + dm-mpath + BFQ

2018-02-09 Thread Bart Van Assche
Hello Paolo, If I enable the BFQ scheduler for a dm-mpath device then a kernel oops appears (see also below). This happens systematically with Linus' tree from this morning (commit 54ce685cae30) merged with Jens' for-linus branch (commit a78773906147 ("block, bfq: add requeue-request hook")) and

[PATCH v3 2/3] block: Fix a race between the cgroup code and request queue initialization

2018-02-09 Thread Bart Van Assche
earlier if necessary. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alibaba.com> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Ha

[PATCH v3 1/3] block: Add a third argument to blk_alloc_queue_node()

2018-02-09 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Christoph Hellwig <h...@lst.de> Cc: Joseph Qi <joseph...@linux.alibaba.com> Cc: Philipp Reisner <philipp.reis...@linbit.com> Cc: Ulf Hansson <ulf.hans...@linar

[PATCH v3 0/3] Fix races between blkcg code and request queue initialization and cleanup

2018-02-09 Thread Bart Van Assche
between the blkcg code and queue cleanup. Changes between v1 and v2: - Split a single patch into two patches. - Dropped blk_alloc_queue_node2() and modified all block drivers that call blk_alloc_queue_node(). Bart Van Assche (3): block: Add a third argument to blk_alloc_queue_node() block

[PATCH v3 3/3] block: Fix a race between request queue removal and the block cgroup controller

2018-02-09 Thread Bart Van Assche
rence on a request queue after having called blk_cleanup_queue(). Neither driver accesses any of the removed data structures between its blk_cleanup_queue() and blk_put_queue() calls. Reported-by: Joseph Qi <joseph...@linux.alibaba.com> Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com>

Re: [PATCH] block: pass inclusive 'lend' parameter to truncate_inode_pages_range

2018-02-09 Thread Bart Van Assche
On Fri, 2018-02-09 at 22:15 +0800, Ming Lei wrote: > The 'lend' parameter of truncate_inode_pages_range is required to be > inclusive, so follow the rule. > > This patch fixes one memory corruption triggered by discard. > > Fixes: 351499a172c0 ("block: Invalidate cache on discard v2") Since

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-08 Thread Bart Van Assche
On Thu, 2018-02-08 at 18:38 +0100, Danil Kipnis wrote: > thanks for the link to the article. To the best of my understanding, > the guys suggest to authenticate the devices first and only then > authenticate the users who use the devices in order to get access to a > corporate service. They also

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-08 Thread Bart Van Assche
On Thu, 2018-02-08 at 09:40 -0800, t...@kernel.org wrote: > Heh, sorry about not being clear. What I'm trying to say is that > scmd->device != NULL && device->host == NULL. Or was this what you > were saying all along? What I agree with is that the request pointer (req argument) is stored in

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-08 Thread Bart Van Assche
On Thu, 2018-02-08 at 09:19 -0800, t...@kernel.org wrote: > Hello, Bart. > > On Thu, Feb 08, 2018 at 05:10:45PM +0000, Bart Van Assche wrote: > > I think "dereferencing a pointer" means reading the memory location that > > pointer points > > at? Anyway, I thi

Re: [PATCH 1/5] blk-mq: tags: define several fields of tags as pointer

2018-02-08 Thread Bart Van Assche
On Sat, 2018-02-03 at 12:21 +0800, Ming Lei wrote: > diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h > index 61deab0b5a5a..a68323fa0c02 100644 > --- a/block/blk-mq-tag.h > +++ b/block/blk-mq-tag.h > @@ -11,10 +11,14 @@ struct blk_mq_tags { > unsigned int nr_tags; > unsigned int

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-08 Thread Bart Van Assche
On Thu, 2018-02-08 at 09:00 -0800, t...@kernel.org wrote: > On Thu, Feb 08, 2018 at 04:31:43PM +0000, Bart Van Assche wrote: > > The crash is reported at address scsi_times_out+0x17 == scsi_times_out+23. > > The > > instruction at that address tries to dereference s

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-08 Thread Bart Van Assche
On Thu, 2018-02-08 at 07:39 -0800, t...@kernel.org wrote: > On Thu, Feb 08, 2018 at 01:09:57AM +0000, Bart Van Assche wrote: > > On Wed, 2018-02-07 at 23:48 +0000, Bart Van Assche wrote: > > > With this patch applied I see requests for which it seems like the > > >

Re: [PATCH V2 2/8] blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS

2018-02-08 Thread Bart Van Assche
On Mon, 2018-02-05 at 23:20 +0800, Ming Lei wrote: > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > index 55c0a745b427..385bbec73804 100644 > --- a/block/blk-mq-sched.c > +++ b/block/blk-mq-sched.c > @@ -81,6 +81,17 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx >

Re: [PATCH rfc 3/5] irq_poll: wire up irq_am

2018-02-07 Thread Bart Van Assche
On Tue, 2018-02-06 at 00:03 +0200, Sagi Grimberg wrote: > +void irq_poll_init_am(struct irq_poll *iop, unsigned int nr_events, > +unsigned short nr_levels, unsigned short start_level, irq_poll_am_fn > *amfn) > +{ > + iop->amfn = amfn; > + irq_am_init(&iop->am, nr_events, nr_levels,

Re: [PATCH rfc 2/5] irq-am: add some debugfs exposure on tuning state

2018-02-07 Thread Bart Van Assche
On Tue, 2018-02-06 at 00:03 +0200, Sagi Grimberg wrote: > +static int irq_am_register_debugfs(struct irq_am *am) > +{ > + char name[20]; > + > + snprintf(name, sizeof(name), "am%u", am->id); > + am->debugfs_dir = debugfs_create_dir(name, > +

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 23:48 +0000, Bart Van Assche wrote: > With this patch applied I see requests for which it seems like the timeout > handler > did not get invoked: [ ... ] I just noticed the following in the system log, which is probably the reason why some requests got stuck: Feb

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 12:07 -0800, t...@kernel.org wrote: > Ah, you're right. u64_stat_sync doesn't imply barriers, so we want > something like the following. > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index df93102..d6edf3b 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 12:09 -0800, t...@kernel.org wrote: > Hello, > > On Wed, Feb 07, 2018 at 07:03:56PM +0000, Bart Van Assche wrote: > > I tried the above patch but already during the first iteration of the test I > > noticed that the test hung, probably due to the follo

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 09:06 -0800, Tejun Heo wrote: > Can you see whether by any chance the following patch fixes the issue? > If not, can you share the repro case? > > Thanks. > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index df93102..651d18c 100644 > --- a/block/blk-mq.c > +++

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 09:35 -0800, t...@kernel.org wrote: > On Wed, Feb 07, 2018 at 05:27:10PM +0000, Bart Van Assche wrote: > > Even with the above change I think that there is still a race between the > > code that handles timer resets and the completion handler. > >

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 18:18 +0100, Roman Penyaev wrote: > So the question is: are there real life setups where > some of the local IB network members can be untrusted? Hello Roman, You may want to read more about the latest evolutions with regard to network security. An article that I can

Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 09:06 -0800, Tejun Heo wrote: > On Tue, Feb 06, 2018 at 05:11:33PM -0800, Bart Van Assche wrote: > > The following race can occur between the code that resets the timer > > and completion handling: > > - The code that handles BLK_EH_RESET_TIMER

Re: [PATCH V2 2/8] blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS

2018-02-07 Thread Bart Van Assche
On Mon, 2018-02-05 at 23:20 +0800, Ming Lei wrote: > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > index 55c0a745b427..385bbec73804 100644 > --- a/block/blk-mq-sched.c > +++ b/block/blk-mq-sched.c > @@ -81,6 +81,17 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx >

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-07 Thread Bart Van Assche
On Wed, 2018-02-07 at 13:57 +0100, Roman Penyaev wrote: > On Tue, Feb 6, 2018 at 5:01 PM, Bart Van Assche <bart.vanass...@wdc.com> > wrote: > > On Tue, 2018-02-06 at 14:12 +0100, Roman Penyaev wrote: > > Something else I would like to understand better is how much of the

Re: [PATCH V2 2/8] blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS

2018-02-07 Thread Bart Van Assche
On 02/06/18 15:18, Jens Axboe wrote: GLOBAL implies that it's, strangely enough, global. That isn't really the case. Why not call this BLK_MQ_F_HOST_TAGS or something like that? I'd welcome better names, but global doesn't seem to be a great choice. BLK_MQ_F_SET_TAGS? I like the name

[PATCH] blk-mq-debugfs: Show more request state information

2018-02-06 Thread Bart Van Assche
8 ("blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq") Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Tejun Heo <t...@kernel.org> --- block/blk-mq-debugfs.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/block/blk-mq-debugfs.c b/block/blk

[PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

2018-02-06 Thread Bart Van Assche
("blk-mq: replace timeout synchronization with a RCU and generation based scheme") Signed-off-by: Bart Van Assche <bart.vanass...@wdc.com> Cc: Tejun Heo <t...@kernel.org> --- block/blk-core.c | 3 +- block/blk-mq-debugfs.c | 1 - block/

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-06 Thread Bart Van Assche
On Tue, 2018-02-06 at 14:12 +0100, Roman Penyaev wrote: > On Mon, Feb 5, 2018 at 1:16 PM, Sagi Grimberg wrote: > > [ ... ] > > - srp/scst comparison is really not fair having it in legacy request > > mode. Can you please repeat it and report a bug to either linux-rdma > > or

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-06 Thread Bart Van Assche
On Tue, 2018-02-06 at 10:44 +0100, Danil Kipnis wrote: > the configuration (which devices can be accessed by a particular > client) can happen also after the kernel target module is loaded. The > directory in is a module parameter and is fixed. It > contains for example "/ibnbd_devices/". But a

Re: [PATCH] blk-mq: Fix a race between resetting the timer and completion handling

2018-02-05 Thread Bart Van Assche
On Mon, 2018-02-05 at 13:06 -0800, Tejun Heo wrote: > Thanks a lot for testing and fixing the issues but I'm a bit confused > by the patch. Maybe we can split patch a bit more? There seem to be > three things going on, > > 1. Changing preemption protection to irq protection in issue path. > >

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-05 Thread Bart Van Assche
On 02/05/18 08:40, Danil Kipnis wrote: It just occurred to me, that we could easily extend the interface in such a way that each client (i.e. each session) would have on server side her own directory with the devices it can access. I.e. instead of just "dev_search_path" per server, any client

Re: [PATCH v2 2/2] block: Fix a race between the throttling code and request queue initialization

2018-02-05 Thread Bart Van Assche
On Sat, 2018-02-03 at 10:51 +0800, Joseph Qi wrote: > Hi Bart, > > On 18/2/3 00:21, Bart Van Assche wrote: > > On Fri, 2018-02-02 at 09:02 +0800, Joseph Qi wrote: > > > We triggered this race when using single queue. I'm not sure if it > > > exists in multi-qu

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-05 Thread Bart Van Assche
On Mon, 2018-02-05 at 18:16 +0100, Roman Penyaev wrote: > Everything (fio jobs, setup, etc) is given in the same link: > > https://www.spinics.net/lists/linux-rdma/msg48799.html > > at the bottom you will find links on google docs with many pages > and archived fio jobs and scripts. (I do not

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-05 Thread Bart Van Assche
On Mon, 2018-02-05 at 14:16 +0200, Sagi Grimberg wrote: > - Your latency measurements are surprisingly high for a null target >device (even for low end nvme device actually) regardless of the >transport implementation. > > For example: > - QD=1 read latency is 648.95 for ibnbd (I assume

Re: [PATCH 05/24] ibtrs: client: main functionality

2018-02-05 Thread Bart Van Assche
On Mon, 2018-02-05 at 15:19 +0100, Roman Penyaev wrote: > On Mon, Feb 5, 2018 at 12:19 PM, Sagi Grimberg wrote: > > Do you actually ever have remote write access in your protocol? > > We do not have reads, instead client writes on write and server writes > on read. (write only

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-05 Thread Bart Van Assche
On Mon, 2018-02-05 at 09:56 +0100, Jinpu Wang wrote: > Hi Bart, > > My another 2 cents:) > On Fri, Feb 2, 2018 at 6:05 PM, Bart Van Assche <bart.vanass...@wdc.com> > wrote: > > On Fri, 2018-02-02 at 15:08 +0100, Roman Pen wrote: > > > o Simple configurat

Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)

2018-02-02 Thread Bart Van Assche
On Fri, 2018-02-02 at 15:08 +0100, Roman Pen wrote: > o Simple configuration of IBNBD: >- Server side is completely passive: volumes do not need to be > explicitly exported. That sounds like a security hole? I think the ability to configure whether or not an initiator is allowed to log

Re: [PATCH 05/24] ibtrs: client: main functionality

2018-02-02 Thread Bart Van Assche
On Fri, 2018-02-02 at 15:08 +0100, Roman Pen wrote: > +static inline struct ibtrs_tag * > +__ibtrs_get_tag(struct ibtrs_clt *clt, enum ibtrs_clt_con_type con_type) > +{ > + size_t max_depth = clt->queue_depth; > + struct ibtrs_tag *tag; > + int cpu, bit; > + > + cpu = get_cpu(); >
