Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio

2018-01-30 Thread Ming Lei
On Tue, Jan 30, 2018 at 04:24:14PM +0100, Jiri Palecek wrote: > > On 1/30/18 1:53 PM, Ming Lei wrote: > > On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček wrote: > > > Avoids page leak from bounced requests > > > --- > > > block/blk-map.c | 3 ++- > > > 1 file changed, 2

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread jianchao.wang
Hi Jens On 01/30/2018 11:57 PM, Jens Axboe wrote: > On 1/30/18 8:41 AM, Jens Axboe wrote: >> Hi, >> >> I just hit this on 4.15+ on the laptop, it's running Linus' git >> as of yesterday, right after the block tree merge: >> >> commit 0a4b6e2f80aad46fb55a5cf7b1664c0aef030ee0 >> Merge: 9697e9da8429

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Ming Lei
On Tue, Jan 30, 2018 at 08:22:27PM -0700, Jens Axboe wrote: > On 1/30/18 8:21 PM, Bart Van Assche wrote: > > On Tue, 2018-01-30 at 20:17 -0700, Jens Axboe wrote: > >> BLK_STS_RESOURCE should always be safe to return, and it should work > >> the same as STS_DEV_RESOURCE, except it may cause an

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Jens Axboe
On 1/30/18 8:27 PM, Bart Van Assche wrote: > On Tue, 2018-01-30 at 20:22 -0700, Jens Axboe wrote: >> On 1/30/18 8:21 PM, Bart Van Assche wrote: >>> On Tue, 2018-01-30 at 20:17 -0700, Jens Axboe wrote: BLK_STS_RESOURCE should always be safe to return, and it should work the same as

[PATCH] bcache: fix error return value in memory shrink

2018-01-30 Thread tang . junhui
From: Tang Junhui In bch_mca_scan(), the return value should not be the number of freed btree nodes, but the number of pages of freed btree nodes. Signed-off-by: Tang Junhui --- drivers/md/bcache/btree.c | 2 +- 1 file changed, 1 insertion(+), 1
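The unit mismatch described in this patch can be shown with a small user-space sketch. It is not the bcache code: `toy_cache`, `toy_scan`, and the assumption that the scan target is expressed in pages are all hypothetical, chosen only to show why a scan routine that frees whole nodes should report its result in pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache whose btree nodes each span several pages. */
struct toy_cache {
    size_t pages_per_node;  /* assumed fixed node size, in pages */
    size_t cached_nodes;    /* nodes currently eligible for freeing */
};

/*
 * Toy scan routine. The caller asks for nr_to_scan pages (an assumption of
 * this sketch) and expects the return value in the same unit.
 */
size_t toy_scan(struct toy_cache *c, size_t nr_to_scan)
{
    assert(c->pages_per_node > 0);

    /* Convert the page target into a node target. */
    size_t want_nodes = nr_to_scan / c->pages_per_node;
    size_t freed = want_nodes < c->cached_nodes ? want_nodes : c->cached_nodes;

    c->cached_nodes -= freed;

    /* Returning 'freed' alone would under-report by a factor of pages_per_node. */
    return freed * c->pages_per_node;
}
```

With 4 pages per node, a request to scan 12 pages frees 3 nodes and reports 12 pages, so the caller's accounting stays in one unit.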

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 20:22 -0700, Jens Axboe wrote: > On 1/30/18 8:21 PM, Bart Van Assche wrote: > > On Tue, 2018-01-30 at 20:17 -0700, Jens Axboe wrote: > > > BLK_STS_RESOURCE should always be safe to return, and it should work > > > the same as STS_DEV_RESOURCE, except it may cause an extra

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Jens Axboe
On 1/30/18 8:21 PM, Bart Van Assche wrote: > On Tue, 2018-01-30 at 20:17 -0700, Jens Axboe wrote: >> BLK_STS_RESOURCE should always be safe to return, and it should work >> the same as STS_DEV_RESOURCE, except it may cause an extra queue >> run. >> >> Well written drivers should use

Re: [PATCH v6] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Jens Axboe
On 1/30/18 8:04 PM, Mike Snitzer wrote: > From: Ming Lei > > This status is returned from driver to block layer if device related > resource is unavailable, but driver can guarantee that IO dispatch > will be triggered in future when the resource is available. > > Convert

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Jens Axboe
On 1/30/18 10:52 AM, Bart Van Assche wrote: > On 01/30/18 06:24, Mike Snitzer wrote: >> + * >> + * If driver returns BLK_STS_RESOURCE and SCHED_RESTART >> + * bit is set, run queue after a delay to avoid IO stalls >> + * that could otherwise occur if

[PATCH v6] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Mike Snitzer
From: Ming Lei This status is returned from driver to block layer if device related resource is unavailable, but driver can guarantee that IO dispatch will be triggered in future when the resource is available. Convert some drivers to return BLK_STS_DEV_RESOURCE. Also, if
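As a rough illustration of the distinction this patch draws, the following user-space sketch models the two status codes. Every name in it (`toy_status`, `toy_queue_rq`, `toy_needs_delayed_run`) is hypothetical and none of it is kernel code. The delayed-run rule follows the patch comment quoted elsewhere in this thread: a plain resource status with the restart bit set leads to a delayed queue run.

```c
#include <stdbool.h>

/* Hypothetical model of the statuses discussed in the thread. */
enum toy_status {
    TOY_STS_OK,
    TOY_STS_RESOURCE,      /* out of resources, no promise of a later rerun */
    TOY_STS_DEV_RESOURCE   /* device resource busy, rerun is guaranteed */
};

/*
 * Driver side: pick a status for a request. 'rerun_guaranteed' is true only
 * when the driver knows dispatch will be triggered again once the device
 * resource frees up.
 */
enum toy_status toy_queue_rq(bool resource_available, bool rerun_guaranteed)
{
    if (resource_available)
        return TOY_STS_OK;
    return rerun_guaranteed ? TOY_STS_DEV_RESOURCE : TOY_STS_RESOURCE;
}

/*
 * Block layer side: only the plain resource status, combined with the
 * restart bit, asks for a delayed queue run to avoid a stall on an idle queue.
 */
bool toy_needs_delayed_run(enum toy_status st, bool sched_restart_set)
{
    return st == TOY_STS_RESOURCE && sched_restart_set;
}
```

In this model the device-resource status never triggers the delayed run, which matches the remark in the replies that the plain resource status is always safe to return but may cost an extra queue run.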

Re: [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Jens Axboe
On 1/30/18 7:24 AM, Mike Snitzer wrote: > From: Ming Lei > > This status is returned from driver to block layer if device related > resource is unavailable, but driver can guarantee that IO dispatch > will be triggered in future when the resource is available. > > Convert

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Ming Lei
On Tue, Jan 30, 2018 at 09:52:31AM -0800, Bart Van Assche wrote: > On 01/30/18 06:24, Mike Snitzer wrote: > > +* > > +* If driver returns BLK_STS_RESOURCE and SCHED_RESTART > > +* bit is set, run queue after a delay to avoid IO stalls > > +* that

[PATCH 5/5] lightnvm: pblk: refactor bad block identification

2018-01-30 Thread Javier González
In preparation for the OCSSD 2.0 spec. bad block identification, refactor the current code to generalize bad block get/set functions and structures. Signed-off-by: Javier González --- drivers/lightnvm/pblk-init.c | 213 +++

[PATCH 2/5] lightnvm: pblk: check data lines version on recovery

2018-01-30 Thread Javier González
From: Hans Holmberg As a preparation for future bumps of data line persistent storage versions, we need to start checking the emeta line version during recovery. Also split up the current emeta/smeta version into two bytes (major,minor). Recovering lines with the same

Re: [PATCH 1/2] lightnvm: remove mlc pairs structure

2018-01-30 Thread Javier González
> On 30 Jan 2018, at 21.26, Matias Bjørling wrote: > > The known implementations of the 1.2 specification, and upcoming 2.0 > implementation all expose a sequential list of pages to write. > Remove the data structure, as it is no longer needed. > > Signed-off-by: Matias

Re: [PATCH] lightnvm: remove chnl_offset in nvme_nvm_identity

2018-01-30 Thread Javier González
> On 30 Jan 2018, at 22.30, Matias Bjørling wrote: > > The identity structure is initialized to zero in the beginning of > the nvme_nvm_identity function. The chnl_offset is separately set to > zero. Since both the variable and the assignment are never changed, remove > them. > >

Re: [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs

2018-01-30 Thread Joseph Qi
Hi Bart, Thanks very much for the quick response. On 18/1/31 05:19, Bart Van Assche wrote: > On Tue, 2018-01-30 at 19:21 +0800, Joseph Qi wrote: >> Hi Jens and Folks, >> >> Recently we've gotten a hard LOCKUP issue. After investigating the issue >> we've found a race between blk_init_queue_node

Re: BUG: unable to handle kernel NULL pointer dereference in blk_throtl_update_limit_valid

2018-01-30 Thread Eric Biggers
On Tue, Dec 19, 2017 at 06:42:00AM -0800, syzbot wrote: > Hello, > > syzkaller hit the following crash on > 6084b576dca2e898f5c101baef151f7bfdbb606d > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console

Re: [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 19:21 +0800, Joseph Qi wrote: > Hi Jens and Folks, > > Recently we've gotten a hard LOCKUP issue. After investigating the issue > we've found a race between blk_init_queue_node and blkcg_print_blkgs. > The race is described below. > > blk_init_queue_node

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Jens Axboe
On 1/30/18 1:49 PM, Keith Busch wrote: > On Tue, Jan 30, 2018 at 01:32:25PM -0700, Jens Axboe wrote: >> On 1/30/18 1:30 PM, Keith Busch wrote: >>> On Tue, Jan 30, 2018 at 08:57:49AM -0700, Jens Axboe wrote: Looking at the disassembly, 'n' is 2 and 'segments' is 0x. >>> >>> Is this

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Keith Busch
On Tue, Jan 30, 2018 at 01:32:25PM -0700, Jens Axboe wrote: > On 1/30/18 1:30 PM, Keith Busch wrote: > > On Tue, Jan 30, 2018 at 08:57:49AM -0700, Jens Axboe wrote: > >> > >> Looking at the disassembly, 'n' is 2 and 'segments' is 0x. > > > > Is this still a problem if you don't use an IO

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Jens Axboe
On 1/30/18 1:30 PM, Keith Busch wrote: > On Tue, Jan 30, 2018 at 08:57:49AM -0700, Jens Axboe wrote: >> >> Looking at the disassembly, 'n' is 2 and 'segments' is 0x. > > Is this still a problem if you don't use an IO scheduler? With deadline, > I'm not finding any path to

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Keith Busch
On Tue, Jan 30, 2018 at 08:57:49AM -0700, Jens Axboe wrote: > > Looking at the disassembly, 'n' is 2 and 'segments' is 0x. Is this still a problem if you don't use an IO scheduler? With deadline, I'm not finding any path to bio_attempt_discard_merge which is where the nr_phys_segments is

Re: [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Mike Snitzer
On Tue, Jan 30 2018 at 2:42pm -0500, Bart Van Assche wrote: > On Tue, 2018-01-30 at 14:33 -0500, Mike Snitzer wrote: > > On Tue, Jan 30 2018 at 12:52pm -0500, > > Bart Van Assche wrote: > > > > > - This patch does not fix any bugs nor makes

Re: [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On Tue, 2018-01-30 at 14:33 -0500, Mike Snitzer wrote: > On Tue, Jan 30 2018 at 12:52pm -0500, > Bart Van Assche wrote: > > > - This patch does not fix any bugs nor makes block drivers easier to > > read or to implement. So why is this patch considered useful? > > It

Re: [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Mike Snitzer
On Tue, Jan 30 2018 at 12:52pm -0500, Bart Van Assche wrote: > On 01/30/18 06:24, Mike Snitzer wrote: > >+ * > >+ * If driver returns BLK_STS_RESOURCE and SCHED_RESTART > >+ * bit is set, run queue after a delay to avoid IO stalls > >+

Re: [dm-devel] [LSF/MM TOPIC] block, dm: restack queue_limits

2018-01-30 Thread Ewan D. Milne
On Tue, 2018-01-30 at 16:07 +0100, Hannes Reinecke wrote: > On 01/29/2018 10:08 PM, Mike Snitzer wrote: > > We currently don't restack the queue_limits if the lowest, or > > intermediate, layer of an IO stack changes. > > > > This is particularly unfortunate in the case of FLUSH/FUA which may > >

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Laurence Oberman
On Tue, 2018-01-30 at 09:52 -0800, Bart Van Assche wrote: > On 01/30/18 06:24, Mike Snitzer wrote: > > +  * > > +  * If driver returns BLK_STS_RESOURCE and > > SCHED_RESTART > > +  * bit is set, run queue after a delay to avoid IO > > stalls > > +  * that

Re: [dm-devel] [PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Bart Van Assche
On 01/30/18 06:24, Mike Snitzer wrote: +* +* If driver returns BLK_STS_RESOURCE and SCHED_RESTART +* bit is set, run queue after a delay to avoid IO stalls +* that could otherwise occur if the queue is idle. */ -

Re: [LSF/MM TOPIC] De-clustered RAID with MD

2018-01-30 Thread Wol's lists
On 30/01/18 11:24, NeilBrown wrote: On Tue, Jan 30 2018, Wols Lists wrote: On 29/01/18 21:50, NeilBrown wrote: By doing declustered parity you can sanely do raid6 on 100 drives, using a logical stripe size that is much smaller than 100. When recovering a single drive, the 10-groups-of-10

Re: [PATCH v6 1/2] Return bytes transferred for partial direct I/O

2018-01-30 Thread Goldwyn Rodrigues
On 01/29/2018 01:04 PM, Randy Dunlap wrote: > On 01/29/2018 06:57 AM, Goldwyn Rodrigues wrote: >> From: Goldwyn Rodrigues >> > >> diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt >> index 6c00c1e2743f..72e213d62511 100644 >> ---

Re: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Jens Axboe
On 1/30/18 8:41 AM, Jens Axboe wrote: > Hi, > > I just hit this on 4.15+ on the laptop, it's running Linus' git > as of yesterday, right after the block tree merge: > > commit 0a4b6e2f80aad46fb55a5cf7b1664c0aef030ee0 > Merge: 9697e9da8429 796baeeef85a > Author: Linus Torvalds

WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3

2018-01-30 Thread Jens Axboe
Hi, I just hit this on 4.15+ on the laptop, it's running Linus' git as of yesterday, right after the block tree merge: commit 0a4b6e2f80aad46fb55a5cf7b1664c0aef030ee0 Merge: 9697e9da8429 796baeeef85a Author: Linus Torvalds Date: Mon Jan 29 11:51:49 2018 -0800

Re: v4.15 and I/O hang with BFQ

2018-01-30 Thread Paolo Valente
> Il giorno 30 gen 2018, alle ore 15:40, Ming Lei ha > scritto: > > On Tue, Jan 30, 2018 at 03:30:28PM +0100, Oleksandr Natalenko wrote: >> Hi. >> > ... >> systemd-udevd-271 [000] 4.311033: bfq_insert_requests: insert >> rq->0 >> systemd-udevd-271 [000]

Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio

2018-01-30 Thread Jiri Palecek
On 1/30/18 1:53 PM, Ming Lei wrote: On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček wrote: Avoids page leak from bounced requests --- block/blk-map.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/blk-map.c b/block/blk-map.c index

Re: [dm-devel] [LSF/MM TOPIC] block, dm: restack queue_limits

2018-01-30 Thread Hannes Reinecke
On 01/29/2018 10:08 PM, Mike Snitzer wrote: > We currently don't restack the queue_limits if the lowest, or > intermediate, layer of an IO stack changes. > > This is particularly unfortunate in the case of FLUSH/FUA which may > change if/when a HW controller's BBU fails; whereby requiring the

Re: v4.15 and I/O hang with BFQ

2018-01-30 Thread Ming Lei
On Tue, Jan 30, 2018 at 03:30:28PM +0100, Oleksandr Natalenko wrote: > Hi. > ... >systemd-udevd-271 [000] 4.311033: bfq_insert_requests: insert > rq->0 >systemd-udevd-271 [000] ...1 4.311037: blk_mq_do_dispatch_sched: > not get rq, 1 > cfdisk-408 [000]

[PATCH] lightnvm: remove chnl_offset in nvme_nvm_identity

2018-01-30 Thread Matias Bjørling
The identity structure is initialized to zero in the beginning of the nvme_nvm_identity function. The chnl_offset is separately set to zero. Since both the variable and the assignment are never changed, remove them. Signed-off-by: Matias Bjørling --- drivers/nvme/host/lightnvm.c |
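The redundancy this patch removes can be seen in a minimal sketch. The structure and function below are hypothetical stand-ins, not the lightnvm code. The point is only that once a structure has been zeroed as a whole, assigning zero to one of its fields changes nothing.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command structure; field names are illustrative only. */
struct toy_identity {
    unsigned int opcode;
    unsigned int chnl_off;
};

void toy_identity_init(struct toy_identity *c, unsigned int opcode)
{
    assert(c != NULL);

    /* Zero every field in one step. */
    memset(c, 0, sizeof(*c));
    c->opcode = opcode;

    /* A further 'c->chnl_off = 0;' here would be a no-op, so it is omitted. */
}
```

After the call, `chnl_off` is zero regardless of what the memory held before, which is why the separate assignment can go.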

Re: v4.15 and I/O hang with BFQ

2018-01-30 Thread Oleksandr Natalenko
Hi. 30.01.2018 09:19, Ming Lei wrote: Hi, We knew there is IO hang issue on BFQ over USB-storage wrt. blk-mq, and last time I found it is inside BFQ. You can try the debug patch in the following link[1] to see if it is same with the previous report[1][2]: [1]

[PATCH v5] blk-mq: introduce BLK_STS_DEV_RESOURCE

2018-01-30 Thread Mike Snitzer
From: Ming Lei This status is returned from driver to block layer if device related resource is unavailable, but driver can guarantee that IO dispatch will be triggered in future when the resource is available. Convert some drivers to return BLK_STS_DEV_RESOURCE. Also, if

[PATCH 2/2] lightnvm: remove multiple groups in 1.2 data structure

2018-01-30 Thread Matias Bjørling
Only one id group from the 1.2 specification is supported. Make sure that only the first group is accessible. Signed-off-by: Matias Bjørling --- drivers/nvme/host/lightnvm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/lightnvm.c

[PATCH 1/2] lightnvm: remove mlc pairs structure

2018-01-30 Thread Matias Bjørling
The known implementations of the 1.2 specification, and upcoming 2.0 implementation all expose a sequential list of pages to write. Remove the data structure, as it is no longer needed. Signed-off-by: Matias Bjørling --- drivers/nvme/host/lightnvm.c | 14 +- 1 file

Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio

2018-01-30 Thread Ming Lei
On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček wrote: > Avoids page leak from bounced requests > --- > block/blk-map.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/block/blk-map.c b/block/blk-map.c > index d3a94719f03f..702d68166689 100644 > ---

Re: [LSF/MM TOPIC] De-clustered RAID with MD

2018-01-30 Thread NeilBrown
On Tue, Jan 30 2018, Wols Lists wrote: > On 29/01/18 21:50, NeilBrown wrote: >> By doing declustered parity you can sanely do raid6 on 100 drives, using >> a logical stripe size that is much smaller than 100. >> When recovering a single drive, the 10-groups-of-10 would put heavy load >> on 9

[RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs

2018-01-30 Thread Joseph Qi
Hi Jens and Folks, Recently we've gotten a hard LOCKUP issue. After investigating the issue we've found a race between blk_init_queue_node and blkcg_print_blkgs. The race is described below. blk_init_queue_node blkcg_print_blkgs blk_alloc_queue_node (1) q->queue_lock =

Re: [LSF/MM TOPIC] Two blk-mq related topics

2018-01-30 Thread Mel Gorman
On Tue, Jan 30, 2018 at 11:08:28AM +0100, Johannes Thumshirn wrote: > [+Cc Mel] > Jens Axboe writes: > > On 1/29/18 1:56 PM, James Bottomley wrote: > >> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote: > >> [...] > >>> 2. When to enable SCSI_MQ at default again? > >> > >> I'm

Re: [LSF/MM TOPIC] De-clustered RAID with MD

2018-01-30 Thread Wols Lists
On 29/01/18 21:50, NeilBrown wrote: > By doing declustered parity you can sanely do raid6 on 100 drives, using > a logical stripe size that is much smaller than 100. > When recovering a single drive, the 10-groups-of-10 would put heavy load > on 9 other drives, while the decluster approach puts

Re: [LSF/MM TOPIC] Two blk-mq related topics

2018-01-30 Thread Martin Steigerwald
Ming Lei - 30.01.18, 02:24: > > > SCSI_MQ is enabled on V3.17 firstly, but disabled at default. In > > > V4.13-rc1, it is enabled at default, but later the patch is reverted > > > in V4.13-rc7, and becomes disabled at default too. > > > > > > Now both the original reported PM issue(actually SCSI

v4.15 and I/O hang with BFQ

2018-01-30 Thread Oleksandr Natalenko
Hi, Paolo, Ivan, Ming et al. It looks like I've just encountered the issue Ivan has already described in [1]. Since I'm able to reproduce it reliably in a VM, I'd like to draw more attention to it. First, I'm using v4.15 kernel with all pending BFQ fixes: === 2ad909a300c4 bfq-iosched: don't