Re: [PATCH] bio-integrity: revert "stop abusing bi_end_io"

2017-08-05 Thread Martin K. Petersen
Christoph, > We can simply add another bio flag to get back to the previous > behavior. That being said, the thing to do in the end is to verify it > at the top of the stack, and not the bottom, eventually. I can cook > up a patch for that. Yeah, the original code was careful about only adding the

Re: [PATCH] bio-integrity: revert "stop abusing bi_end_io"

2017-08-05 Thread Martin K. Petersen
Mikulas, > The sector number in the integrity tag must match the physical sector > number. So, it must be verified at the bottom. The ref tag seed matches the submitter's block number (typically the block layer sector for the top device) and is remapped to and from the LBA by the SCSI disk driver or

Re: [PATCH 5/6] blk-mq: enable checking two part inflight counts at the same time

2017-08-05 Thread Bart Van Assche
On Fri, 2017-08-04 at 13:56 -0600, Jens Axboe wrote: > On 08/04/2017 01:44 PM, Bart Van Assche wrote: > > On Fri, 2017-08-04 at 09:04 -0600, Jens Axboe wrote: > > > @@ -98,11 +98,13 @@ static void blk_mq_check_inflight(struct > > > blk_mq_hw_ctx *hctx, > > > return; > > > > > > /* >

Re: [PATCH 5/5] testb: badblock support

2017-08-05 Thread Dan Williams
On Sat, Aug 5, 2017 at 8:51 AM, Shaohua Li wrote: > From: Shaohua Li > > Sometimes a disk could have broken tracks where data is inaccessible, > but data in other parts can be accessed normally. MD RAID supports > such disks. But we don't have a good way to

Re: [PATCH 04/14] blk-mq-sched: improve dispatching from sw queue

2017-08-05 Thread h...@infradead.org
On Thu, Aug 03, 2017 at 05:33:13PM +, Bart Van Assche wrote: > Are you aware that the SCSI core already keeps track of the number of busy > requests > per LUN? See also the device_busy member of struct scsi_device. How about > giving the > block layer core access in some way to that counter?

Re: Switching to MQ by default may generate some bug reports

2017-08-05 Thread Mel Gorman
On Sat, Aug 05, 2017 at 12:05:00AM +0200, Paolo Valente wrote: > > > > True. However, the difference between legacy-deadline and mq-deadline is > > roughly around the 5-10% mark across workloads for SSD. It's not > > universally true but the impact is not as severe. While this is not > > proof that

Re: [PATCH 04/14] blk-mq-sched: improve dispatching from sw queue

2017-08-05 Thread Ming Lei
On Thu, Aug 03, 2017 at 05:33:13PM +, Bart Van Assche wrote: > On Thu, 2017-08-03 at 11:13 +0800, Ming Lei wrote: > > On Thu, Aug 03, 2017 at 01:35:29AM +, Bart Van Assche wrote: > > > On Wed, 2017-08-02 at 11:31 +0800, Ming Lei wrote: > > > > On Tue, Aug 01, 2017 at 03:11:42PM +, Bart

Re: [PATCH] bio-integrity: revert "stop abusing bi_end_io"

2017-08-05 Thread Christoph Hellwig
On Thu, Aug 03, 2017 at 10:10:55AM -0400, Mikulas Patocka wrote: > That dm-crypt commit that uses bio integrity payload came 3 months before > 7c20f11680a441df09de7235206f70115fbf6290 and it was already present in > 4.12. And on its own that isn't an argument if your usage is indeed wrong,

[PATCH V2 02/20] sbitmap: introduce __sbitmap_for_each_set()

2017-08-05 Thread Ming Lei
We need to iterate over ctxs starting from an offset, in a round-robin fashion, so introduce this helper. Cc: Omar Sandoval Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 54 - 1 file changed, 40 insertions(+),
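
A rough userspace sketch of the round-robin iteration idea behind such a helper, over a plain word array; the names are illustrative, not the actual sbitmap API:

#include <stdint.h>
#include <stdio.h>

/* Visit every set bit, starting at 'off' and wrapping around, so repeated
 * scans with an advancing offset behave round-robin across the bitmap. */
static void for_each_set_from(const uint64_t *map, unsigned nbits,
                              unsigned off, void (*fn)(unsigned bit))
{
    for (unsigned i = 0; i < nbits; i++) {
        unsigned bit = (off + i) % nbits;

        if (map[bit / 64] & (1ULL << (bit % 64)))
            fn(bit);
    }
}

static void show(unsigned bit) { printf("ctx %u\n", bit); }

int main(void)
{
    uint64_t map[1] = { 0 };

    map[0] |= 1ULL << 1;                  /* ctx 1 has pending requests */
    map[0] |= 1ULL << 5;                  /* ctx 5 has pending requests */
    for_each_set_from(map, 64, 4, show);  /* visits ctx 5, then wraps to 1 */
    return 0;
}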

[PATCH V2 03/20] blk-mq: introduce blk_mq_dispatch_rq_from_ctx()

2017-08-05 Thread Ming Lei
This function is introduced for dequeuing a request from a sw queue so that we can dispatch it in the scheduler's way. More importantly, for some SCSI devices, driver tags are host-wide and the number is quite big, but each LUN has a very limited queue depth. This function is introduced to avoid

[PATCH V2 09/20] blk-mq: introduce BLK_MQ_F_SHARED_DEPTH

2017-08-05 Thread Ming Lei
SCSI devices often provide a per-request_queue depth via q->queue_depth (.cmd_per_lun), which is a global limit across all hw queues. After the pending I/O submitted to one request queue reaches this limit, BLK_STS_RESOURCE will be returned to all dispatch paths. That means when one hw queue is stuck,
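
A minimal sketch of the shared-depth accounting described here, assuming a single atomic in-flight counter per request_queue (the names are illustrative, not the kernel's):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct reqq {
    atomic_int inflight;    /* I/O in flight across all hw queues */
    int queue_depth;        /* the shared limit, e.g. .cmd_per_lun */
};

static bool reserve_shared_slot(struct reqq *q)
{
    if (atomic_fetch_add(&q->inflight, 1) >= q->queue_depth) {
        atomic_fetch_sub(&q->inflight, 1);  /* over budget: back off */
        return false;           /* caller would see BLK_STS_RESOURCE */
    }
    return true;
}

static void release_shared_slot(struct reqq *q)
{
    atomic_fetch_sub(&q->inflight, 1);      /* on request completion */
}

int main(void)
{
    struct reqq q = { .inflight = 0, .queue_depth = 2 };

    printf("%d %d %d\n", reserve_shared_slot(&q),
           reserve_shared_slot(&q), reserve_shared_slot(&q)); /* 1 1 0 */
    release_shared_slot(&q);
    return 0;
}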

[PATCH V2 07/20] blk-mq-sched: introduce blk_mq_sched_queue_depth()

2017-08-05 Thread Ming Lei
The following patch will propose some hints for figuring out the default queue depth for the scheduler queue, so introduce the helper blk_mq_sched_queue_depth() for this purpose. Reviewed-by: Christoph Hellwig Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 8 +---

[PATCH V2 01/20] blk-mq-sched: fix scheduler bad performance

2017-08-05 Thread Ming Lei
When the hw queue is busy, we shouldn't take requests from the scheduler queue any more, otherwise IO merging will be difficult. This patch fixes the awful IO performance on some SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests if the hw queue is busy. Reviewed-by:

[PATCH V2 10/20] blk-mq-sched: introduce helpers for query, change busy state

2017-08-05 Thread Ming Lei
Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 6 +++--- block/blk-mq.c | 4 ++-- block/blk-mq.h | 15 +++ 3 files changed, 20 insertions(+), 5 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index

[PATCH V2 04/20] blk-mq-sched: move actual dispatching into one helper

2017-08-05 Thread Ming Lei
So that it becomes easy to support dispatching from the sw queue in the following patch. No functional change. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git

[PATCH V2 12/20] blk-mq: introduce pointers to dispatch lock & list

2017-08-05 Thread Ming Lei
Prepare to support a per-request-queue dispatch list, so introduce a dispatch lock and list to avoid runtime checks. Signed-off-by: Ming Lei --- block/blk-mq-debugfs.c | 10 +- block/blk-mq-sched.c | 2 +- block/blk-mq.c | 7 +-- block/blk-mq.h

[PATCH V2 00/20] blk-mq-sched: improve SCSI-MQ performance

2017-08-05 Thread Ming Lei
In Red Hat internal storage tests of the blk-mq scheduler, we found that I/O performance is much worse with mq-deadline, especially for sequential I/O on some multi-queue SCSI devices (lpfc, qla2xxx, SRP...). It turns out one big issue causes the performance regression: requests are still dequeued from

[PATCH V2 08/20] blk-mq-sched: use q->queue_depth as hint for q->nr_requests

2017-08-05 Thread Ming Lei
SCSI sets q->queue_depth from shost->cmd_per_lun. q->queue_depth is per request_queue and more closely related to the scheduler queue than to the hw queue depth, which can be shared by queues, such as with TAG_SHARED. This patch tries to use q->queue_depth as a hint for computing q->nr_requests, which should
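
A toy sketch of such a heuristic; the factor of two is my assumption for illustration, not necessarily the value the patch picks:

#include <stdio.h>

/* Size the scheduler queue from the device's shared depth when it is
 * known, so the scheduler holds enough requests to merge and sort
 * without building an unbounded backlog. */
static unsigned sched_nr_requests(unsigned queue_depth, unsigned fallback)
{
    if (queue_depth)
        return 2 * queue_depth;   /* hint: a small multiple of the budget */
    return fallback;              /* no depth reported: keep the default */
}

int main(void)
{
    printf("%u\n", sched_nr_requests(3, 128));   /* cmd_per_lun = 3 -> 6 */
    printf("%u\n", sched_nr_requests(0, 128));   /* no hint -> 128 */
    return 0;
}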

[PATCH V2 05/20] blk-mq-sched: improve dispatching from sw queue

2017-08-05 Thread Ming Lei
SCSI devices use a host-wide tagset, and the shared driver tag space is often quite big. Meanwhile there is also a queue depth for each LUN (.cmd_per_lun), which is often small. So lots of requests may stay in the sw queue, and we always flush all of those belonging to the same hw queue and dispatch them all to the driver,
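
A toy model of the change: pull one request at a time and stop at the first driver pushback, instead of splicing the whole sw queue out at once (all names illustrative):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct request { struct request *next; };
struct sw_queue { struct request *head; };

static struct request *sw_dequeue(struct sw_queue *q)
{
    struct request *rq = q->head;

    if (rq)
        q->head = rq->next;
    return rq;
}

/* Dispatch one by one; on pushback, put the request back and stop, so the
 * remainder stays in the sw queue where it can still be merged. */
static void dispatch_some(struct sw_queue *q, bool (*queue_rq)(struct request *))
{
    struct request *rq;

    while ((rq = sw_dequeue(q)) != NULL) {
        if (!queue_rq(rq)) {        /* driver says "resource busy" */
            rq->next = q->head;
            q->head = rq;
            break;
        }
    }
}

static int slots = 1;               /* fake driver accepts one request */
static bool fake_queue_rq(struct request *rq) { (void)rq; return slots-- > 0; }

int main(void)
{
    struct request b = { NULL }, a = { &b };
    struct sw_queue q = { &a };

    dispatch_some(&q, fake_queue_rq);
    printf("second request still queued: %d\n", q.head == &b); /* 1 */
    return 0;
}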

[PATCH V2 11/20] blk-mq: introduce helpers for operating ->dispatch list

2017-08-05 Thread Ming Lei
Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 13 +++-- block/blk-mq.c | 24 +++- block/blk-mq.h | 40 3 files changed, 54 insertions(+), 23 deletions(-) diff --git a/block/blk-mq-sched.c

[PATCH V2 06/20] blk-mq-sched: don't dequeue request until all in ->dispatch are flushed

2017-08-05 Thread Ming Lei
During dispatch, we move all requests from hctx->dispatch to a temporary list, then dispatch them one by one from this list. Unfortunately, during this period, a queue run from another context may think the queue is idle, then start to dequeue from the sw/scheduler queue and still try to dispatch
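
One way to picture the fix, as a sketch with a single state bit that other contexts check before dequeuing anything new; the field names are illustrative, not the actual kernel ones:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct hctx_state {
    atomic_bool dispatch_busy;  /* set while ->dispatch is being flushed */
};

/* Other contexts must not treat the queue as idle while a flush of the
 * parked ->dispatch requests is still in flight. */
static bool may_dequeue_from_sched(struct hctx_state *h)
{
    return !atomic_load(&h->dispatch_busy);
}

static void flush_dispatch(struct hctx_state *h)
{
    atomic_store(&h->dispatch_busy, true);
    /* ... drain the parked requests one by one ... */
    atomic_store(&h->dispatch_busy, false);
}

int main(void)
{
    struct hctx_state h = { false };

    printf("%d\n", may_dequeue_from_sched(&h));  /* 1: no flush running */
    flush_dispatch(&h);
    return 0;
}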

[PATCH V2 13/20] blk-mq: pass 'request_queue *' to several helpers of operating BUSY

2017-08-05 Thread Ming Lei
We need to support a per-request_queue dispatch list to avoid early dispatch in the case of shared queue depth. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 4 ++-- block/blk-mq.c | 2 +- block/blk-mq.h | 19 +++ 3 files changed, 14

[PATCH V2 16/20] block: move actual bio merge code into __elv_merge

2017-08-05 Thread Ming Lei
So that we can reuse __elv_merge() to merge bios into requests from the sw queue in the following patches. Signed-off-by: Ming Lei --- block/elevator.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/block/elevator.c b/block/elevator.c

[PATCH V2 14/20] blk-mq-sched: improve IO scheduling on SCSI device

2017-08-05 Thread Ming Lei
SCSI devices often have a per-request_queue queue depth (.cmd_per_lun), which is actually applied across all hw queues; this patchset calls this the shared queue depth. One principle of scheduling is that we shouldn't dequeue a request from the sw/scheduler queue and dispatch it to the driver when the low level

[PATCH V2 19/20] blk-mq-sched: refactor blk_mq_sched_try_merge()

2017-08-05 Thread Ming Lei
This patch introduces the function __blk_mq_try_merge(), which will be reused for bio merging into the sw queue in the following patch. No functional change. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-)

[PATCH V2 18/20] block: introduce .last_merge and .hash to blk_mq_ctx

2017-08-05 Thread Ming Lei
Prepare for supporting bio merging into the sw queue when no blk-mq io scheduler is used. Signed-off-by: Ming Lei --- block/blk-mq.h | 4 block/blk.h | 3 +++ block/elevator.c | 22 +++--- 3 files changed, 26 insertions(+), 3 deletions(-) diff --git

[PATCH V2 20/20] blk-mq: improve bio merge from blk-mq sw queue

2017-08-05 Thread Ming Lei
This patch uses a hash table to do bio merging from the sw queue, so we can align with the blk-mq scheduler's and legacy block layer's way of bio merging. It turns out that bio merging via a hash table is more efficient than simple merging against the last 8 requests in the sw queue. On SCSI SRP, a ~10% IOPS increase is observed in
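
A self-contained sketch of hash-based back-merge lookup: requests are hashed by the sector they end at, so a bio starting there finds its merge candidate without scanning the queue. The hash function and names are illustrative:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define RQ_HASH_BITS 6
#define RQ_HASH_SIZE (1u << RQ_HASH_BITS)

struct rq {
    uint64_t start, end;        /* [start, end) in sectors */
    struct rq *hash_next;
};

struct rq_hash { struct rq *bucket[RQ_HASH_SIZE]; };

static unsigned rq_hash_fn(uint64_t sector)
{
    /* Fibonacci hashing; any decent mix works for a sketch */
    return (unsigned)((sector * 11400714819323198485ull) >> (64 - RQ_HASH_BITS));
}

static void rqhash_add(struct rq_hash *h, struct rq *rq)
{
    unsigned b = rq_hash_fn(rq->end);

    rq->hash_next = h->bucket[b];
    h->bucket[b] = rq;
}

/* A bio starting exactly where a request ends is a back-merge candidate. */
static struct rq *rqhash_find_back_merge(struct rq_hash *h, uint64_t bio_start)
{
    for (struct rq *rq = h->bucket[rq_hash_fn(bio_start)]; rq; rq = rq->hash_next)
        if (rq->end == bio_start)
            return rq;
    return NULL;
}

int main(void)
{
    static struct rq_hash h;    /* zero-initialized buckets */
    struct rq r = { .start = 100, .end = 108, .hash_next = NULL };

    rqhash_add(&h, &r);
    printf("%s\n", rqhash_find_back_merge(&h, 108) ? "merge" : "no merge");
    return 0;
}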

[PATCH V2 17/20] block: add check on elevator for supporting bio merge via hashtable from blk-mq sw queue

2017-08-05 Thread Ming Lei
blk_mq_sched_try_merge() will be reused in following patches to support bio merging into the blk-mq sw queue, so add checks to the related functions which are called from blk_mq_sched_try_merge(). Signed-off-by: Ming Lei --- block/elevator.c | 16 1 file changed, 16

[PATCH V2 15/20] block: introduce rqhash helpers

2017-08-05 Thread Ming Lei
We need these helpers to support using a hashtable to improve bio merging from the sw queue in the following patches. No functional change. Signed-off-by: Ming Lei --- block/blk.h | 52 block/elevator.c | 36

Re: [PATCH] bio-integrity: revert "stop abusing bi_end_io"

2017-08-05 Thread Mikulas Patocka
On Sat, 5 Aug 2017, Christoph Hellwig wrote: > > On Thu, Aug 03, 2017 at 10:10:55AM -0400, Mikulas Patocka wrote: > > That dm-crypt commit that uses bio integrity payload came 3 months before > > 7c20f11680a441df09de7235206f70115fbf6290 and it was already present in > > 4.12. > > And on

[PATCH 2/5] testb: implement block device operations

2017-08-05 Thread Shaohua Li
From: Kyungchan Koh This creates/removes the disk when the user writes 1/0 to the 'power' attribute. Signed-off-by: Kyungchan Koh Signed-off-by: Shaohua Li --- drivers/block/test_blk.c | 539 ++- 1 file changed, 538

[PATCH 1/5] testb: add interface

2017-08-05 Thread Shaohua Li
From: Kyungchan Koh The testb block device driver is intended for testing, so configuration should be easy. We are using configfs here, which can be configured with a shell script. Basically, testb will be configured as follows: mount the configfs filesystem as usual: mount -t configfs

[PATCH 4/5] testb: emulate disk cache

2017-08-05 Thread Shaohua Li
From: Kyungchan Koh Software must flush the disk cache to guarantee data safety. To check whether software correctly flushes the disk cache, we must know the behavior of the disk. But physical disk behavior is uncontrollable. Even if software doesn't do the flush, the disk probably does the
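
A toy model of an emulated write-back cache: writes are acknowledged into a volatile buffer and only reach "media" on flush, so a simulated power loss deterministically drops unflushed data, which is exactly what a cache-flush test needs to provoke (all names are mine, not the driver's):

#include <stdbool.h>
#include <stdio.h>

#define SLOTS 8

struct emu_disk {
    unsigned long media[SLOTS];   /* durable contents */
    unsigned long cache[SLOTS];   /* acknowledged but volatile */
    bool dirty[SLOTS];
};

static void emu_write(struct emu_disk *d, unsigned slot, unsigned long v)
{
    d->cache[slot] = v;
    d->dirty[slot] = true;        /* completed to the host, not durable */
}

static void emu_flush(struct emu_disk *d)
{
    for (unsigned i = 0; i < SLOTS; i++)
        if (d->dirty[i]) {
            d->media[i] = d->cache[i];
            d->dirty[i] = false;
        }
}

static void emu_power_loss(struct emu_disk *d)
{
    for (unsigned i = 0; i < SLOTS; i++)
        d->dirty[i] = false;      /* unflushed data is simply gone */
}

int main(void)
{
    static struct emu_disk d;

    emu_write(&d, 0, 0xabc);
    emu_flush(&d);                /* flushed: survives power loss */
    emu_write(&d, 1, 0xdef);      /* never flushed */
    emu_power_loss(&d);
    printf("%lx %lx\n", d.media[0], d.media[1]);  /* abc 0 */
    return 0;
}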

[PATCH 0/5] block: a virtual block device driver for testing

2017-08-05 Thread Shaohua Li
From: Shaohua Li In testing software RAID, I usually found it hard to cover specific cases. RAID is supposed to work even when a disk is in a semi-good state, for example, when some sectors are broken. Since we can't control the behavior of hardware, it's difficult to create test suites to do

[PATCH 5/5] testb: badblock support

2017-08-05 Thread Shaohua Li
From: Shaohua Li Sometimes a disk could have broken tracks where data is inaccessible, but data in other parts can be accessed normally. MD RAID supports such disks. But we don't have a good way to test it, because we can't control which part of a physical disk is bad. For a
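
A minimal model of configurable bad blocks: a list of [start, end) sector ranges, where any I/O overlapping a range fails as if the track were broken. (The kernel has generic badblocks infrastructure the driver could lean on; this sketch only shows the idea.)

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct bad_range { uint64_t start, end; };  /* [start, end) in sectors */

/* Fail any I/O that overlaps a configured bad range, exactly like a
 * broken track, while all other sectors keep working normally. */
static bool io_hits_badblock(const struct bad_range *bad, unsigned nr,
                             uint64_t sector, uint64_t nr_sectors)
{
    for (unsigned i = 0; i < nr; i++)
        if (sector < bad[i].end && sector + nr_sectors > bad[i].start)
            return true;
    return false;
}

int main(void)
{
    const struct bad_range bad[] = { { 100, 110 } };

    printf("%d\n", io_hits_badblock(bad, 1, 90, 8));   /* 0: misses */
    printf("%d\n", io_hits_badblock(bad, 1, 105, 4));  /* 1: overlaps */
    return 0;
}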

[PATCH 3/5] testb: implement bandwidth control

2017-08-05 Thread Shaohua Li
From: Kyungchan Koh In testing, we usually expect controllable disk speed. For example, in a RAID array, we'd like some disks to be fast and some slow. MD RAID actually has a feature for this. To test the feature, we'd like to make the disk run at a specific speed. Block throttling
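
A generic token-bucket sketch of bandwidth capping; the text above suggests the driver builds on the block-layer throttling code, so this only shows the principle, with invented names:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct bw_limit {
    uint64_t bytes_per_interval;  /* configured cap, e.g. bps / ticks_per_sec */
    uint64_t budget;              /* bytes still allowed this interval */
};

static void bw_tick(struct bw_limit *l)
{
    l->budget = l->bytes_per_interval;  /* a timer refills each interval */
}

static bool bw_try_charge(struct bw_limit *l, uint64_t bytes)
{
    if (bytes > l->budget)
        return false;             /* requeue the I/O until the next tick */
    l->budget -= bytes;
    return true;
}

int main(void)
{
    struct bw_limit l = { .bytes_per_interval = 4096, .budget = 0 };

    bw_tick(&l);
    printf("%d\n", bw_try_charge(&l, 4096));  /* 1: fits this interval */
    printf("%d\n", bw_try_charge(&l, 1));     /* 0: budget exhausted */
    return 0;
}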