Re: [PATCH V2 00/16] Introduce the BFQ I/O scheduler

2017-04-11 Thread Paolo Valente

> On 10 Apr 2017, at 18:56, Bart Van Assche wrote:
> 
> On Fri, 2017-03-31 at 14:47 +0200, Paolo Valente wrote:
>> [ ... ]
> 
> Hello Paolo,
> 
> Is the git tree that is available at https://github.com/Algodev-github/bfq-mq
> appropriate for testing BFQ? If I merge that tree with v4.11-rc6 and if I run
> the srp-test software against that tree as follows:
> 
> ./run_tests -e bfq-mq -t 02-mq
> 
> then the following appears on the console:
> 
> [ 2748.650352] BUG: unable to handle kernel NULL pointer dereference at 
> 00d0
> [ 2748.650442] IP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.650509] PGD 0 
> [ 2748.650511] 
> [ 2748.650585] Oops:  [#1] SMP
> [ 2748.651107] CPU: 9 PID: 10772 Comm: kworker/9:2H Tainted: G  I 
> 4.11.0-rc6-dbg+ #1
> [ 2748.651191] Workqueue: kblockd blk_mq_requeue_work
> [ 2748.651228] task: 88037c808040 task.stack: c90003b4c000
> [ 2748.651268] RIP: 0010:__bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.651307] RSP: 0018:c90003b4f9d8 EFLAGS: 00010002
> [ 2748.651345] RAX: 0001 RBX:  RCX: 
> 0001
> [ 2748.651383] RDX: 0001 RSI: 880377f52e80 RDI: 
> 880401f774e8
> [ 2748.651423] RBP: c90003b4fa80 R08: 9093955f R09: 
> 0001
> [ 2748.651464] R10: c90003b4fa00 R11: a06d0d53 R12: 
> 880401f77840
> [ 2748.651506] R13: 880401f774e8 R14: 880378a451e0 R15: 
> 
> [ 2748.651547] FS:  () GS:88046f04() 
> knlGS:
> [ 2748.651588] CS:  0010 DS:  ES:  CR0: 80050033
> [ 2748.651626] CR2: 00d0 CR3: 01c0f000 CR4: 
> 001406e0
> [ 2748.651664] Call Trace:
> [ 2748.651778]  bfq_insert_request+0x83/0x280 [bfq_mq_iosched]
> [ 2748.651934]  bfq_insert_requests+0x50/0x70 [bfq_mq_iosched]
> [ 2748.651975]  blk_mq_sched_insert_request+0x11e/0x170
> [ 2748.652015]  blk_insert_cloned_request+0xb6/0x1f0
> [ 2748.652361]  map_request+0x13c/0x290 [dm_mod]
> [ 2748.652403]  dm_mq_queue_rq+0x90/0x160 [dm_mod]
> [ 2748.652441]  blk_mq_dispatch_rq_list+0x1f2/0x3e0
> [ 2748.652479]  blk_mq_sched_dispatch_requests+0xf1/0x190
> [ 2748.652516]  __blk_mq_run_hw_queue+0x12d/0x1c0
> [ 2748.652553]  __blk_mq_delay_run_hw_queue+0xe3/0xf0
> [ 2748.652593]  blk_mq_run_hw_queues+0x5c/0x80
> [ 2748.652632]  blk_mq_requeue_work+0x132/0x150
> [ 2748.652671]  process_one_work+0x206/0x6a0
> [ 2748.652709]  worker_thread+0x49/0x4a0
> [ 2748.652745]  kthread+0x107/0x140
> [ 2748.652854]  ret_from_fork+0x2e/0x40
> [ 2748.652891] Code: ff 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 
> 83 c4 80 8b 87 58 03 00 00 48 8b 9e b0 00 00 00 85 c0 0f 84 8b 04 00 00 <48> 
> 8b 83 d0 00 00 00 48 85 c0 0f 84 63 04 00 00
> 48 83 e8 10 48 
> [ 2748.653049] RIP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched] RSP: 
> c90003b4f9d8
> [ 2748.653090] CR2: 00d0
> 
> The crash address corresponds to the following source code according to gdb:
> 
> (gdb) list *(__bfq_insert_request+0x26)
> 0xd6f6 is in __bfq_insert_request (block/bfq-mq-iosched.c:4430).
> 4425
> 4426    static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
> 4427    {
> 4428            struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq;
> 4429
> 4430            assert_spin_locked(&bfqd->lock);
> 4431
> 4432            bfq_log_bfqq(bfqd, bfqq, "__insert_req: rq %p bfqq %p", rq, bfqq);
> 4433
> 4434            /*
> 

Hi Bart,
I've tried to figure out how to deal with this crash, but I haven't
found a sensible way forward, for the following two reasons.

First, unless I'm missing something, I don't yet have the hardware
required to run srp-test, so I cannot easily reproduce this
failure.  Actually, BFQ is not yet suitable, and in its current design
may never be, for very high-speed hardware such as InfiniBand and
NVMe devices.

Second, a NULL-pointer fault at the line you report is rather weird.
In fact, the sequence of C-code instructions executed up to that line
is:

struct bfq_data *bfqd = q->elevator->elevator_data;
...
spin_lock_irq(&bfqd->lock);
__bfq_insert_request(bfqd, rq);
/* inside the __bfq_insert_request function: */
struct bfq_queue *bfqq = RQ_BFQQ(rq), ...;
assert_spin_locked(&bfqd->lock);

So how can the last line cause a NULL-pointer dereference on the very
address, &bfqd->lock, on which spin_lock_irq(&bfqd->lock) had just
successfully taken the lock?
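
A note on reading the fault address: the 00d0 value that the oops
reports in CR2 is small and non-zero, which is the pattern one would
expect when a field at that offset is read through a NULL base
pointer. The user-space sketch below only illustrates that arithmetic.
struct demo_queue, its layout, and field_address() are hypothetical
names made up for illustration, not the real BFQ structures, and the
sketch does not establish which kernel pointer was actually NULL.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a kernel structure. The padding is chosen
 * only so that 'field' lands at offset 0xd0; this is NOT the layout of
 * struct bfq_queue or struct bfq_data.
 */
struct demo_queue {
	char pad[0xd0];
	void *field;
};

/*
 * Address that a load of base->field would touch, computed with plain
 * integer arithmetic so that nothing is actually dereferenced.
 */
uintptr_t field_address(uintptr_t base)
{
	return base + offsetof(struct demo_queue, field);
}
```

With a NULL base, field_address(0) evaluates to 0xd0, i.e. the fault
address equals the member offset. If a debug build is at hand,
comparing 0xd0 against the real member offsets (for instance with
"ptype /o" in recent gdb versions) might narrow down the candidates.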

Any idea on how to proceed?  If this strange bug remains hard to spot,
then, if you agree, in the meantime I will go ahead and submit a
new version of the patch series, which addresses your other issues.

Thanks,
Paolo

> Bart.



Re: [PATCH V2 00/16] Introduce the BFQ I/O scheduler

2017-04-10 Thread Bart Van Assche
On Fri, 2017-03-31 at 14:47 +0200, Paolo Valente wrote:
> [ ... ]

Hello Paolo,

Is the git tree that is available at https://github.com/Algodev-github/bfq-mq
appropriate for testing BFQ? If I merge that tree with v4.11-rc6 and if I run
the srp-test software against that tree as follows:

./run_tests -e bfq-mq -t 02-mq

then the following appears on the console:

[ 2748.650352] BUG: unable to handle kernel NULL pointer dereference at 
00d0
[ 2748.650442] IP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
[ 2748.650509] PGD 0 
[ 2748.650511] 
[ 2748.650585] Oops:  [#1] SMP
[ 2748.651107] CPU: 9 PID: 10772 Comm: kworker/9:2H Tainted: G  I 
4.11.0-rc6-dbg+ #1
[ 2748.651191] Workqueue: kblockd blk_mq_requeue_work
[ 2748.651228] task: 88037c808040 task.stack: c90003b4c000
[ 2748.651268] RIP: 0010:__bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
[ 2748.651307] RSP: 0018:c90003b4f9d8 EFLAGS: 00010002
[ 2748.651345] RAX: 0001 RBX:  RCX: 0001
[ 2748.651383] RDX: 0001 RSI: 880377f52e80 RDI: 880401f774e8
[ 2748.651423] RBP: c90003b4fa80 R08: 9093955f R09: 0001
[ 2748.651464] R10: c90003b4fa00 R11: a06d0d53 R12: 880401f77840
[ 2748.651506] R13: 880401f774e8 R14: 880378a451e0 R15: 
[ 2748.651547] FS:  () GS:88046f04() 
knlGS:
[ 2748.651588] CS:  0010 DS:  ES:  CR0: 80050033
[ 2748.651626] CR2: 00d0 CR3: 01c0f000 CR4: 001406e0
[ 2748.651664] Call Trace:
[ 2748.651778]  bfq_insert_request+0x83/0x280 [bfq_mq_iosched]
[ 2748.651934]  bfq_insert_requests+0x50/0x70 [bfq_mq_iosched]
[ 2748.651975]  blk_mq_sched_insert_request+0x11e/0x170
[ 2748.652015]  blk_insert_cloned_request+0xb6/0x1f0
[ 2748.652361]  map_request+0x13c/0x290 [dm_mod]
[ 2748.652403]  dm_mq_queue_rq+0x90/0x160 [dm_mod]
[ 2748.652441]  blk_mq_dispatch_rq_list+0x1f2/0x3e0
[ 2748.652479]  blk_mq_sched_dispatch_requests+0xf1/0x190
[ 2748.652516]  __blk_mq_run_hw_queue+0x12d/0x1c0
[ 2748.652553]  __blk_mq_delay_run_hw_queue+0xe3/0xf0
[ 2748.652593]  blk_mq_run_hw_queues+0x5c/0x80
[ 2748.652632]  blk_mq_requeue_work+0x132/0x150
[ 2748.652671]  process_one_work+0x206/0x6a0
[ 2748.652709]  worker_thread+0x49/0x4a0
[ 2748.652745]  kthread+0x107/0x140
[ 2748.652854]  ret_from_fork+0x2e/0x40
[ 2748.652891] Code: ff 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 
83 c4 80 8b 87 58 03 00 00 48 8b 9e b0 00 00 00 85 c0 0f 84 8b 04 00 00 <48> 8b 
83 d0 00 00 00 48 85 c0 0f 84 63 04 00 00
48 83 e8 10 48 
[ 2748.653049] RIP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched] RSP: 
c90003b4f9d8
[ 2748.653090] CR2: 00d0

The crash address corresponds to the following source code according to gdb:

(gdb) list *(__bfq_insert_request+0x26)
0xd6f6 is in __bfq_insert_request (block/bfq-mq-iosched.c:4430).
4425
4426    static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
4427    {
4428            struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq;
4429
4430            assert_spin_locked(&bfqd->lock);
4431
4432            bfq_log_bfqq(bfqd, bfqq, "__insert_req: rq %p bfqq %p", rq, bfqq);
4433
4434            /*

Bart.

[PATCH V2 00/16] Introduce the BFQ I/O scheduler

2017-03-31 Thread Paolo Valente
Hi,
with respect to the previous submission [1], this new patch series:
- contains all the changes suggested by Jens and Bart [1], apart from
  those for which I raised doubts that either have been acknowledged,
  or have not received a reply yet (I will of course also apply the
  latter changes if those threads restart);
- contains a fix to the bug causing the failure reported by Jens [2].

As for major changes, this patch series:
- solves the nesting problem between scheduler and io-context locks, by
  not taking any reference to io contexts anymore [3];
- splits the original, single source file into three files.

These last two contributions are provided by two additional patches in
the series. I've not merged these changes into the other patches for
the following reasons:

- Merging these changes would have implied splitting them into further
  smaller pieces, applying each piece to the right previous patch, and
  resolving all the conflicts generated by each per-patch
  modification. This would have taken a great deal of time, and would
  have carried a real risk of introducing subtle errors (I tried for a
  few days, and then abandoned this solution).

- The removal of extra io-context references is a non-trivial change
  to code that has worked the other way, for probably about a decade,
  in CFQ.  The change seems to be fine, but in case of errors, it is
  probably much easier to find and clearly fix them if they are
  confined to a single commit.

- A dedicated commit for the removal of extra io-context references
  also documents how it has been obtained, and what assumptions have
  been made.

- Similarly, an explicit split of the source file shows where each
  piece has gone, instead of exposing only the result of the split,
  with possible mistakes buried in it.

I have run all the tests I could.

Some patches still generate WARNINGS with checkpatch.pl, but these
WARNINGS seem to be either unavoidable for the involved pieces of code
(which the patch just extends), or false positives.

Thanks,
Paolo

[1] https://lkml.org/lkml/2017/3/4/148
[2] https://lkml.org/lkml/2017/3/6/887
[3] https://lkml.org/lkml/2017/3/18/34

Arianna Avanzini (4):
  block, bfq: add full hierarchical scheduling and cgroups support
  block, bfq: add Early Queue Merge (EQM)
  block, bfq: reduce idling only in symmetric scenarios
  block, bfq: handle bursts of queue activations

Paolo Valente (12):
  block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler
  block, bfq: improve throughput boosting
  block, bfq: modify the peak-rate estimator
  block, bfq: add more fairness with writes and slow processes
  block, bfq: improve responsiveness
  block, bfq: reduce I/O latency for soft real-time applications
  block, bfq: preserve a low latency also with NCQ-capable drives
  block, bfq: reduce latency during request-pool saturation
  block, bfq: boost the throughput on NCQ-capable flash-based devices
  block, bfq: boost the throughput with random I/O on NCQ-capable HDDs
  block, bfq: remove all get and put of I/O contexts
  block, bfq: split bfq-iosched.c into multiple source files

 Documentation/block/00-INDEX        |    2 +
 Documentation/block/bfq-iosched.txt |  531
 block/Kconfig.iosched               |   21 +
 block/Makefile                      |    1 +
 block/bfq-cgroup.c                  | 1139
 block/bfq-iosched.c                 | 5030 +++
 block/bfq-iosched.h                 |  942 +++
 block/bfq-wf2q.c                    | 1616 +++
 include/linux/blkdev.h              |    2 +-
 9 files changed, 9283 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/block/bfq-iosched.txt
 create mode 100644 block/bfq-cgroup.c
 create mode 100644 block/bfq-iosched.c
 create mode 100644 block/bfq-iosched.h
 create mode 100644 block/bfq-wf2q.c

--
2.10.0