Christoph,
> We can simply add another bio flag to get back to the previous
> behavior. That being said, the thing to do in the end is to verify it
> at the top of the stack, and not the bottom, eventually. I can cook
> up a patch for that.
Yeah, the original code was careful about only adding the
Mikulas,
> The sector number in the integrity tag must match the physical sector
> number. So, it must be verified at the bottom.
The ref tag seed matches the submitter block number (typically block
layer sector for the top device) and is remapped to and from the LBA by
the SCSI disk driver or
On Fri, 2017-08-04 at 13:56 -0600, Jens Axboe wrote:
> On 08/04/2017 01:44 PM, Bart Van Assche wrote:
> > On Fri, 2017-08-04 at 09:04 -0600, Jens Axboe wrote:
> > > @@ -98,11 +98,13 @@ static void blk_mq_check_inflight(struct
> > > blk_mq_hw_ctx *hctx,
> > > return;
> > >
> > > /*
>
On Sat, Aug 5, 2017 at 8:51 AM, Shaohua Li wrote:
> From: Shaohua Li
>
> Sometimes a disk could have broken tracks where data is inaccessible,
> but data in other parts can be accessed in the normal way. MD RAID
> supports such disks. But we don't have a good way to
On Thu, Aug 03, 2017 at 05:33:13PM +, Bart Van Assche wrote:
> Are you aware that the SCSI core already keeps track of the number of busy
> requests
> per LUN? See also the device_busy member of struct scsi_device. How about
> giving the
> block layer core access in some way to that counter?
On Sat, Aug 05, 2017 at 12:05:00AM +0200, Paolo Valente wrote:
> >
> > True. However, the difference between legacy-deadline and mq-deadline
> > is roughly around the 5-10% mark across workloads for SSD. It's not
> > universally true but the impact is not as severe. While this is not
> > proof that
On Thu, Aug 03, 2017 at 05:33:13PM +, Bart Van Assche wrote:
> On Thu, 2017-08-03 at 11:13 +0800, Ming Lei wrote:
> > On Thu, Aug 03, 2017 at 01:35:29AM +, Bart Van Assche wrote:
> > > On Wed, 2017-08-02 at 11:31 +0800, Ming Lei wrote:
> > > > On Tue, Aug 01, 2017 at 03:11:42PM +, Bart
On Thu, Aug 03, 2017 at 10:10:55AM -0400, Mikulas Patocka wrote:
> That dm-crypt commit that uses bio integrity payload came 3 months before
> 7c20f11680a441df09de7235206f70115fbf6290 and it was already present in
> 4.12.
And on its own that isn't an argument if your usage is indeed wrong,
We need to iterate over ctx starting from an offset, in a
round-robin way, so introduce this helper.
Cc: Omar Sandoval
Signed-off-by: Ming Lei
---
include/linux/sbitmap.h | 54 -
1 file changed, 40 insertions(+),
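The round-robin iteration the helper provides can be sketched in plain user-space C. This is only an illustration of the pattern, not the sbitmap code; the name `find_first_set_from` and its signature are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: scan all nr slots of a map, starting from 'off'
 * and wrapping around, so repeated calls with a moving 'off' visit the
 * slots in a round-robin fashion instead of always favoring slot 0. */
static int find_first_set_from(const unsigned long *map, size_t nr, size_t off)
{
    for (size_t i = 0; i < nr; i++) {
        size_t idx = (off + i) % nr;   /* wrap around past the end */
        if (map[idx])
            return (int)idx;
    }
    return -1;                         /* no set slot found */
}
```

Starting each scan at the last-used position plus one is what keeps the distribution fair across ctxs.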
This function is introduced for dequeuing a request
from the sw queue so that we can dispatch it in the scheduler's way.
More importantly, for some SCSI devices, driver
tags are host-wide and the number is quite big,
but each LUN has a very limited queue depth. This
function is introduced to avoid
SCSI devices often provide one per-request_queue depth via
q->queue_depth (.cmd_per_lun), which is a global limit on all
hw queues. After the pending I/O submitted to one request queue
reaches this limit, BLK_STS_RESOURCE will be returned to all
dispatch paths. That means when one hw queue is stuck,
The following patch will propose some hints for figuring out
the default queue depth for the scheduler queue, so introduce
the helper blk_mq_sched_queue_depth() for this purpose.
Reviewed-by: Christoph Hellwig
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 8 +---
When the hw queue is busy, we shouldn't take requests from the
scheduler queue any more, otherwise I/O merging will be
difficult to do.
This patch fixes the awful I/O performance on some
SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber
is used, by not taking requests if the hw queue is busy.
Reviewed-by:
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 6 +++---
block/blk-mq.c | 4 ++--
block/blk-mq.h | 15 +++
3 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index
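The dispatch decision being described can be modeled in a few lines of user-space C. This is a toy model under stated assumptions (the struct, field names, and function here are all illustrative, not the blk-mq code): a hw queue is "busy" after the driver reported a resource shortage, and while busy we leave requests in the scheduler queue so later bios can still be merged into them.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model, not kernel code. */
struct hctx_model {
    bool busy;          /* set when the driver reported BLK_STS_RESOURCE */
    int  sched_pending; /* requests held back in the scheduler queue */
};

/* Returns 1 if one request was dequeued and dispatched, 0 otherwise.
 * While the hw queue is busy we dequeue nothing, keeping the requests
 * in the scheduler queue where incoming bios can still merge into them. */
static int take_from_scheduler(struct hctx_model *h)
{
    if (h->busy || h->sched_pending == 0)
        return 0;
    h->sched_pending--;
    return 1;
}
```

The point of the patch is exactly this hold-back: dequeuing into a stuck driver destroys the merge window for sequential I/O.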
So that it becomes easy to support dispatching from the
sw queue in the following patch.
No functional change.
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 28 ++--
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git
Prepare to support a per-request_queue dispatch list,
so introduce a dispatch lock and list to avoid
runtime checks.
Signed-off-by: Ming Lei
---
block/blk-mq-debugfs.c | 10 +-
block/blk-mq-sched.c | 2 +-
block/blk-mq.c | 7 +--
block/blk-mq.h
In Red Hat internal storage tests wrt. blk-mq schedulers, we
found that I/O performance is much worse with mq-deadline, especially
for sequential I/O on some multi-queue SCSI devices (lpfc, qla2xxx,
SRP...)
It turns out one big issue causes the performance regression: requests
are still dequeued from
SCSI sets q->queue_depth from shost->cmd_per_lun, and
q->queue_depth is per request_queue and more related to the
scheduler queue than to the hw queue depth, which can be
shared by queues, such as with TAG_SHARED.
This patch tries to use q->queue_depth as a hint for computing
q->nr_requests, which should
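The message is truncated before it shows the formula, so the following is only a plausible sketch of such a hint, under the assumption that the scheduler queue is sized as a small multiple of the device depth and otherwise falls back to the default; the function name and the factor of 2 are hypothetical:

```c
#include <assert.h>

/* Hypothetical heuristic (the exact formula isn't shown in the message):
 * when the device exposes a per-request_queue depth such as .cmd_per_lun,
 * size the scheduler queue relative to it instead of using the (often
 * much larger) default derived from the shared driver tag space. */
static int sched_nr_requests(int queue_depth, int default_nr)
{
    if (queue_depth > 0 && 2 * queue_depth < default_nr)
        return 2 * queue_depth;   /* small device depth: shrink nr_requests */
    return default_nr;            /* no hint, or hint already big enough */
}
```

The design intuition is that queuing far more requests in the scheduler than the device can ever have in flight only delays merging and dispatch decisions.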
SCSI devices use a host-wide tagset, and the shared
driver tag space is often quite big. Meanwhile
there is also a queue depth for each LUN (.cmd_per_lun),
which is often small.
So lots of requests may stay in the sw queue, and we
always flush all of those belonging to the same hw queue and
dispatch them all to the driver,
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 13 +++--
block/blk-mq.c | 24 +++-
block/blk-mq.h | 40
3 files changed, 54 insertions(+), 23 deletions(-)
diff --git a/block/blk-mq-sched.c
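The alternative to flushing everything can be sketched as budgeted dispatch. This is a toy user-space model (all names are illustrative, not blk-mq code): stop dequeuing from the sw queue once in-flight I/O reaches the shared per-request_queue depth, instead of handing the driver far more than the LUN can accept.

```c
#include <assert.h>

/* Toy model of budgeted dispatch: dequeue from the sw queue only while
 * the shared depth has room, and report how many requests were taken. */
static int dispatch_budgeted(int sw_pending, int inflight, int queue_depth)
{
    int dispatched = 0;
    while (sw_pending > 0 && inflight + dispatched < queue_depth) {
        sw_pending--;       /* take one request out of the sw queue */
        dispatched++;       /* ...and hand it to the driver */
    }
    return dispatched;
}
```

Requests left in the sw queue stay available for merging and are not bounced back with BLK_STS_RESOURCE.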
During dispatching, we move all requests from hctx->dispatch to
one temporary list, then dispatch them one by one from this list.
Unfortunately, during this period, a queue run from other contexts
may think the queue is idle, then start to dequeue from the
sw/scheduler queue and still try to dispatch
We need to support a per-request_queue dispatch list to avoid
early dispatch in the case of shared queue depth.
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 4 ++--
block/blk-mq.c | 2 +-
block/blk-mq.h | 19 +++
3 files changed, 14
So that we can reuse __elv_merge() to merge bios
into requests from the sw queue in the following patches.
Signed-off-by: Ming Lei
---
block/elevator.c | 19 +--
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/block/elevator.c b/block/elevator.c
A SCSI device often has a per-request_queue queue depth
(.cmd_per_lun), which is actually applied across all hw
queues; this patchset calls this the shared queue depth.
One theory of the scheduler is that we shouldn't dequeue a
request from the sw/scheduler queue and dispatch it to the
driver when the low level
This patch introduces the function __blk_mq_try_merge(),
which will be reused for bio merging into the sw queue in
the following patch.
No functional change.
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 18 ++
1 file changed, 14 insertions(+), 4 deletions(-)
Prepare for supporting bio merging into the sw queue when no
blk-mq I/O scheduler is in use.
Signed-off-by: Ming Lei
---
block/blk-mq.h | 4
block/blk.h | 3 +++
block/elevator.c | 22 +++---
3 files changed, 26 insertions(+), 3 deletions(-)
diff --git
This patch uses a hash table to do bio merging from the sw queue,
so we can align with the way blk-mq schedulers and the legacy
block layer do bio merging.
It turns out bio merging via a hash table is more efficient than
a simple merge against the last 8 requests in the sw queue. On SCSI
SRP, ~10% higher IOPS is observed in
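The hash-table lookup being described works like the elevator's request hash: requests are keyed by the sector right after their data, so a bio that starts exactly there is a back-merge candidate found in one bucket scan. A minimal user-space sketch (struct and names are illustrative, not the elevator code):

```c
#include <assert.h>
#include <stddef.h>

#define MERGE_HASH_BITS 6
#define MERGE_HASH_SIZE (1 << MERGE_HASH_BITS)

/* Simplified model of hash-based back-merge lookup. */
struct req_model {
    unsigned long end_sector;      /* sector right after the request's data */
    struct req_model *hash_next;   /* chain within one hash bucket */
};

static struct req_model *merge_hash[MERGE_HASH_SIZE];

static unsigned int hash_sector(unsigned long sector)
{
    return (unsigned int)(sector % MERGE_HASH_SIZE);
}

static void hash_add(struct req_model *rq)
{
    unsigned int h = hash_sector(rq->end_sector);
    rq->hash_next = merge_hash[h];
    merge_hash[h] = rq;
}

/* A bio starting at 'bio_sector' can be back-merged into a request that
 * ends exactly there: one bucket scan instead of walking the queue. */
static struct req_model *find_back_merge(unsigned long bio_sector)
{
    for (struct req_model *rq = merge_hash[hash_sector(bio_sector)];
         rq; rq = rq->hash_next)
        if (rq->end_sector == bio_sector)
            return rq;
    return NULL;
}
```

Compared with scanning the last 8 requests, the lookup cost no longer grows with how many mergeable requests sit in the sw queue.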
blk_mq_sched_try_merge() will be reused in following patches
to support bio merging into the blk-mq sw queue, so add checks to
the related functions which are called from blk_mq_sched_try_merge().
Signed-off-by: Ming Lei
---
block/elevator.c | 16
1 file changed, 16
We need these helpers to support using a hash table to improve
bio merging from the sw queue in the following patches.
No functional change.
Signed-off-by: Ming Lei
---
block/blk.h | 52
block/elevator.c | 36
On Sat, 5 Aug 2017, Christoph Hellwig wrote:
>
> On Thu, Aug 03, 2017 at 10:10:55AM -0400, Mikulas Patocka wrote:
> > That dm-crypt commit that uses bio integrity payload came 3 months before
> > 7c20f11680a441df09de7235206f70115fbf6290 and it was already present in
> > 4.12.
>
> And on
From: Kyungchan Koh
This creates/removes the disk when the user writes 1/0 to the 'power' attribute.
Signed-off-by: Kyungchan Koh
Signed-off-by: Shaohua Li
---
drivers/block/test_blk.c | 539 ++-
1 file changed, 538
From: Kyungchan Koh
The testb block device driver is intended for testing, so configuration
should be easy. We are using configfs here, which can be configured with
a shell script. Basically, testb will be configured as:
mount the configfs fs as usual:
mount -t configfs
From: Kyungchan Koh
Software must flush the disk cache to guarantee data safety. To check
whether software correctly does the disk cache flush, we must know the
behavior of the disk. But physical disk behavior is uncontrollable. Even
if software doesn't do the flush, the disk probably does the
From: Shaohua Li
In testing software RAID, I usually find it's hard to cover specific cases.
RAID is supposed to work even when a disk is in a semi-good state, for
example when some sectors are broken. Since we can't control the behavior
of hardware, it's difficult to create test suites to do
From: Shaohua Li
Sometimes a disk could have broken tracks where data is inaccessible,
but data in other parts can be accessed in the normal way. MD RAID supports
such disks. But we don't have a good way to test it, because we can't
control which part of a physical disk is bad. For a
From: Kyungchan Koh
In tests, we usually expect controllable disk speed. For example, in a
RAID array, we'd like some disks to be fast and some slow. MD RAID
actually has a feature for this. To test the feature, we'd like to make
the disk run at a specific speed.
block throttling