[PATCH] block: kill all_q_node in request_queue

2019-04-18 Thread Hou Tao
all_q_node has not been used since commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"), so remove it. Signed-off-by: Hou Tao --- include/linux/blkdev.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 5c

[PATCH 1/2] block: make rq sector size accessible for block stats

2019-05-21 Thread Hou Tao
("block: move blk_stat_add() to __blk_mq_end_request()") Signed-off-by: Hou Tao --- block/blk-mq.c | 11 +-- block/blk-throttle.c | 3 ++- include/linux/blkdev.h | 15 --- 3 files changed, 19 insertions(+), 10 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index

[PATCH 2/2] block: also check RQF_STATS in blk_mq_need_time_stamp()

2019-05-21 Thread Hou Tao
In __blk_mq_end_request(), if the block stats need updating, we should ensure now is a valid timestamp instead of 0 even when iostat is disabled. Signed-off-by: Hou Tao --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 4d1462172f0f
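
A minimal sketch of the check being argued for, assuming the RQF_* flag names from the blk-mq code of that era:

static inline bool blk_mq_need_time_stamp(struct request *rq)
{
	/* a request with RQF_STATS set needs a valid completion
	 * timestamp even when iostat (RQF_IO_STAT) is disabled */
	return (rq->rq_flags & (RQF_IO_STAT | RQF_STATS)) || rq->q->elevator;
}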

[PATCH 0/2] fixes for block stats

2019-05-21 Thread Hou Tao
Regards, Tao Hou Tao (2): block: make rq sector size accessible for block stats block: also check RQF_STATS in blk_mq_need_time_stamp() block/blk-mq.c | 17 - block/blk-throttle.c | 3 ++- include/linux/blkdev.h | 15 --- 3 files changed, 22 insertions(+

Re: [PATCH 0/2] fixes for block stats

2019-05-24 Thread Hou Tao
ping ? On 2019/5/21 15:59, Hou Tao wrote: > The first patch fixes the problem that there is no sample in > /sys/kernel/debug/block/nvmeXn1/poll_stat and hybrid poll may > not work as expected. The second patch tries to ensure > the latency accounting for block stats will work

Re: [PATCH 1/2] block: make rq sector size accessible for block stats

2019-05-27 Thread Hou Tao
times (although not possible for an nvme device), but __blk_mq_end_request() is only invoked once. Regards, Tao > > On 21/05/2019 10:59, Hou Tao wrote: >> Currently rq->data_len will be decreased by partial completion or >> zeroed by completion, so when blk_stat_add() is invoked,
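
A minimal sketch of the approach patch 1/2 describes: record the request size when the request is started, before partial completions shrink data_len, so blk_stat_add() can still bucket by I/O size at completion time. Field name and exact placement are assumptions (stats-related part only):

void blk_mq_start_request(struct request *rq)
{
	if (test_bit(QUEUE_FLAG_STATS, &rq->q->queue_flags)) {
		rq->io_start_time_ns = ktime_get_ns();
		rq->stats_sectors = blk_rq_sectors(rq);	/* size survives completion */
		rq->rq_flags |= RQF_STATS;
	}
}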

BFQ: the purpose of idle rb-tree in bfq_service_tree

2017-05-25 Thread Hou Tao
Hi Paolo, I am reading the code of the BFQ scheduler and have a question about the purpose of the idle rb-tree in bfq_service_tree. From the comment in the code, the idle rb-tree is used to keep the bfq_queue which doesn't have any request and has a finish time greater than the vtime of the service tree.

Re: [PATCH v2] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-05-30 Thread Hou Tao
Hi Jens, I didn't find the patch in your linux-block git tree or the vanilla git tree. Maybe you have forgotten this CFQ fix? Regards, Tao On 2017/3/9 19:22, Hou Tao wrote: > On 2017/3/8 22:05, Jan Kara wrote: >> On Wed 08-03-17 20:16:55, Hou Tao wrote: >>> When adding a

[PATCH] blkcg: kill unused field nr_undestroyed_grps

2016-07-26 Thread Hou Tao
'nr_undestroyed_grps' in struct throtl_data was used to count the number of throtl_grps related to the throtl_data, but throtl_grps are now tracked by blkcg_gq, so the field is unused. Signed-off-by: Hou Tao --- block/blk-throttle.c | 5 - 1 file changed, 5 deletions(-) diff --git a

[PATCH] block-throttle: fix throtl_log for throttled-bios dispatch

2016-09-10 Thread Hou Tao
queued=1/0 throtl /1 dispatch queued=2/0 .. throtl /1 dispatch disp=1 Signed-off-by: Hou Tao --- block/blk-throttle.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 47a3e54..c724c97 100644 --- a/blo

[PATCH] blk-throttle: fix infinite throttling caused by non-cascading timer wheel

2016-09-11 Thread Hou Tao
it's OK to renew the time slice. 2. If there is no queued bio, the time slice must have expired, so it's OK to renew the time slice. Signed-off-by: Hou Tao --- block/blk-throttle.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/block/blk-throttle.c b/blo

Re: [PATCH] blk-throttle: fix infinite throttling caused by non-cascading timer wheel

2016-09-16 Thread Hou Tao
> Thanks for the patch. I can reproduce it. I am wondering why you are > doing so many checks. Can't we just check whether the throttle group is empty or > not? If it is empty and the slice has expired, then start a new slice. If > the throttle group is not empty, then we know the slice has to be an active slice
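
A minimal sketch of the simplified check Vivek suggests, using helper names from block/blk-throttle.c (exact call site assumed):

	/* group is empty and the slice has run out: safe to start fresh */
	if (throtl_slice_used(tg, rw) && !tg->service_queue.nr_queued[rw])
		throtl_start_new_slice(tg, rw);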

Re: [PATCH 02/11] block: Fix race of bdev open with gendisk shutdown

2017-11-16 Thread Hou Tao
Hi Jan, On 2017/3/13 23:14, Jan Kara wrote: > blkdev_open() may race with gendisk shutdown in two different ways. > Either del_gendisk() has already unhashed block device inode (and thus > bd_acquire() will end up creating new block device inode) however > get_gendisk() will still return the gendisk

Re: [PATCH 02/11] block: Fix race of bdev open with gendisk shutdown

2017-11-23 Thread Hou Tao
Hi Jan, On 2017/11/21 0:43, Jan Kara wrote: > Hi Tao! > > On Fri 17-11-17 14:51:18, Hou Tao wrote: >> On 2017/3/13 23:14, Jan Kara wrote: >>> blkdev_open() may race with gendisk shutdown in two different ways. >>> Either del_gendisk() has already unhashed

[PATCH] blktrace: output io cgroup name for cgroup v1

2017-12-27 Thread Hou Tao
and cgrp_dfl_root is only valid for cgroup v2. So fix cgroup_path_from_kernfs_id() to support both cgroup v1 and v2. Fixes: 69fd5c3 ("blktrace: add an option to allow displaying cgroup path") Signed-off-by: Hou Tao --- include/linux/cgroup.h | 6 +++--- kernel/cgroup/

[PATCH] blktrace: support enabling cgroup info per-device

2018-01-10 Thread Hou Tao
act_mask in struct blk_user_trace_setup and a new attr file (cgroup_info) under the /sys/block/$dev/trace dir, so the BLKTRACESETUP ioctl and the sysfs file can be used to enable cgroup info for selected block devices. Signed-off-by: Hou Tao --- include/linux/blktrace_api.h | 2 ++ include/

Re: blkdev loop UAF

2018-01-11 Thread Hou Tao
Hi, On 2018/1/11 16:24, Dan Carpenter wrote: > Thanks for your report and the patch. I am sending it to the > linux-block devs since it's already public. > > regards, > dan carpenter The use-after-free problem is not specific to the loop device; it can also be reproduced on a scsi device, and there

Re: [PATCH] blktrace: support enabling cgroup info per-device

2018-01-16 Thread Hou Tao
Hi Jens, Any comments on this patch and the related patch set for blktrace [1] ? Regards, Tao [1]: https://www.spinics.net/lists/linux-btrace/msg00790.html On 2018/1/11 12:09, Hou Tao wrote: > Now blktrace supports outputting cgroup info for trace action and > trace message, however,

Re: [PATCH] blktrace: support enabling cgroup info per-device

2018-01-22 Thread Hou Tao
Hi Jens, Could you please look at this patch and the related patch set for blktrace [1], and give some feedback ? Regards, Tao [1]: https://www.spinics.net/lists/linux-btrace/msg00790.html On 2018/1/17 14:10, Hou Tao wrote: > Hi Jens, > > Any comments on this patch and the related

[PATCH] block, bfq: dispatch request to prevent queue stalling after the request completion

2017-07-11 Thread Hou Tao
then the remaining requests of the busy bfq queue will be stalled in the bfq scheduler until a new request arrives. To fix the scheduler latency problem, we need to check whether or not all issued requests have completed, and dispatch more requests to the driver if there is no request in the driver. Signed-off-by: Hou Tao
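
A minimal sketch of the completion-side check being proposed (field and helper names assumed from the bfq code of that era):

	/* on request completion: nothing left in the driver but bfq
	 * still has busy queues, so kick dispatch ourselves instead of
	 * waiting for a new request to arrive */
	bfqd->rq_in_driver--;
	if (bfqd->rq_in_driver == 0 && bfqd->busy_queues > 0)
		blk_mq_run_hw_queues(bfqd->queue, true);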

[PATCH] block, bfq: fix typos in comments about B-WF2Q+ algorithm

2017-07-12 Thread Hou Tao
The start time of an eligible entity should be less than or equal to the current virtual time, and an entity in the idle tree has a finish time greater than the current virtual time. Signed-off-by: Hou Tao --- block/bfq-iosched.h | 2 +- block/bfq-wf2q.c | 2 +- 2 files changed, 2 insertions
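
In B-WF2Q+ notation (S_i the virtual start time, F_i the virtual finish time, V the service tree's virtual time), the two corrected statements read:

	S_i <= V   (the entity is eligible)
	F_i >  V   (the entity is parked in the idle tree)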

Re: [PATCH] block, bfq: dispatch request to prevent queue stalling after the request completion

2017-07-12 Thread Hou Tao
On 2017/7/12 17:41, Paolo Valente wrote: > >> Il giorno 11 lug 2017, alle ore 15:58, Hou Tao ha >> scritto: >> >> There are mq devices (e.g., virtio-blk, nbd and loopback) which don't >> invoke blk_mq_run_hw_queues() after the completion of a request. >

Re: [PATCH 4/6] genhd: Fix use after free in __blkdev_get()

2018-02-12 Thread Hou Tao
- remember this was old device - this was last ref and disk is > now freed > } > disk_unblock_events(disk); -> oops > > Fix the problem by making sure we drop reference to disk in > __blkdev_get() only after we are really done with it. > > Re

Re: [PATCH 5/6] genhd: Fix BUG in blkdev_open()

2018-02-12 Thread Hou Tao
ever manage to look up > newly created bdev inode, we are also guaranteed that following > get_gendisk() will either return failure (and we fail open) or it > returns gendisk for the new device and following bdget_disk() will > return new bdev inode (i.e., blkdev_open() follows t

[PATCH RFC 1/4] dm thin: add a pool feature "keep_bio_blkcg"

2017-01-20 Thread Hou Tao
"keep_bio_blkcg" is used to control whether or not dm-thin needs to save the original blkcg of bio Signed-off-by: Hou Tao --- drivers/md/dm-thin.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index d1c05c1..8178ee8 100644 ---

[PATCH RFC 2/4] dm thin: parse "keep_bio_blkcg" from userspace tools

2017-01-20 Thread Hou Tao
The keep_bio_blkcg feature is off by default, and it can be turned on by using the "keep_bio_blkcg" argument. Signed-off-by: Hou Tao --- drivers/md/dm-thin.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 8178ee

[PATCH RFC 4/4] dm thin: associate bio with current task if keep_bio_blkcg is enabled

2017-01-20 Thread Hou Tao
If keep_bio_blkcg is enabled, assign the io_context and the blkcg of the current task to the bio before processing the bio. Signed-off-by: Hou Tao --- drivers/md/dm-thin.c | 5 + drivers/md/dm-thin.h | 17 + 2 files changed, 22 insertions(+) create mode 100644 drivers/md/dm-thin.h
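
A minimal sketch of the association step, assuming the 2017-era bio_associate_current() helper (which attaches the current task's io_context and blkcg to a bio); the feature-flag field is hypothetical:

static void thin_bio_associate_blkcg(struct thin_c *tc, struct bio *bio)
{
	if (tc->pool->pf.keep_bio_blkcg)	/* hypothetical pool feature flag */
		bio_associate_current(bio);
}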

[PATCH RFC 0/4] dm thin: support blk-throttle on data and metadata device

2017-01-20 Thread Hou Tao
only set a limitation on the blkcg of the original IO thread, so blk-throttle doesn't work well. In order to handle this situation, we add a "keep_bio_blkcg" feature to dm-thin. If the feature is enabled, the original blkcg of the bio will be saved at thin_map() and will be used during blk-throttle

[PATCH RFC 3/4] dm thin: show the enabled status of keep_bio_blkcg feature

2017-01-20 Thread Hou Tao
If the keep_bio_blkcg feature is enabled, that can be confirmed via the STATUSTYPE_TABLE or STATUSTYPE_INFO command. Signed-off-by: Hou Tao --- drivers/md/dm-thin.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 57d6202..140cdae

[PATCH] blkcg: fix double free of new_blkg in blkcg_init_queue

2017-01-24 Thread Hou Tao
If blkg_create fails, new_blkg passed as an argument will be freed by blkg_create, so there is no need to free it again. Signed-off-by: Hou Tao --- block/blk-cgroup.c | 1 - 1 file changed, 1 deletion(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 8ba0af7..58fb0dd 100644 --- a

[PATCH RESEND] blkcg: fix double free of new_blkg in blkcg_init_queue

2017-02-03 Thread Hou Tao
If blkg_create fails, new_blkg passed as an argument will be freed by blkg_create, so there is no need to free it again. Signed-off-by: Hou Tao --- block/blk-cgroup.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 8ba0af7
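
A minimal sketch of the ownership rule both postings rely on (call shape as in blkcg_init_queue()):

	blkg = blkg_create(&blkcg_root, q, new_blkg);
	if (IS_ERR(blkg)) {
		/* blkg_create() has already freed new_blkg on failure;
		 * freeing it again here would be the double free */
		return PTR_ERR(blkg);
	}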

rate-capped fio jobs in a CFQ group degrade performance of fio jobs in another CFQ group with the same weight

2017-02-09 Thread Hou Tao
Hi all, During our test of the CFQ group scheduler, we found a performance-related problem. Rate-capped fio jobs in a CFQ group will degrade the performance of fio jobs in another CFQ group. Both of the CFQ groups have the same blkio.weight. We launch two fios in different terminals. The following

[PATCH] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-02-28 Thread Hou Tao
iops delay and lead to an abnormal io schedule delay for the added cfq_group. To fix it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5 when iops mode is enabled. Cc: # 4.8+ Signed-off-by: Hou Tao --- block/cfq-iosched.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-)

[PATCH] block: use put_io_context_active() to disassociate bio from a task

2017-03-02 Thread Hou Tao
Signed-off-by: Hou Tao --- block/bio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/bio.c b/block/bio.c index 5eec5e0..d8ed36f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -2072,7 +2072,7 @@ EXPORT_SYMBOL_GPL(bio_associate_current); void bio_disassociate_task

Re: [PATCH] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-03 Thread Hou Tao
On 2017/3/2 18:29, Jan Kara wrote: > On Wed 01-03-17 10:07:44, Hou Tao wrote: >> When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY >> as the delay of cfq_group's vdisktime if there have been other cfq_groups >> already. >> >> When cfq i

Re: [PATCH] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-06 Thread Hou Tao
Hi Vivek, On 2017/3/4 3:53, Vivek Goyal wrote: > On Fri, Mar 03, 2017 at 09:20:44PM +0800, Hou Tao wrote: > > [..] >>> Frankly, vdisktime is in fixed-point precision shifted by >>> CFQ_SERVICE_SHIFT so using CFQ_IDLE_DELAY does not make much sense in any

cfq-iosched: two questions about the hrtimer version of CFQ

2017-03-06 Thread Hou Tao
Hi Jan and list, When testing the hrtimer version of CFQ, we found a performance degradation problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at least 1 jiffie instead of 1 ns"). The following is the test process: * filesystem and block device * XFS + /dev/sda mou

cfq-iosched: two questions about the hrtimer version of CFQ

2017-03-06 Thread Hou Tao
Hi Jan and list, When testing the hrtimer version of CFQ, we found a performance degradation problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at least 1 jiffie instead of 1 ns"). The following is the test process: * filesystem and block device * XFS + /dev/sda mou

Re: cfq-iosched: two questions about the hrtimer version of CFQ

2017-03-06 Thread Hou Tao
Sorry for the resend, please refer to the later one. On 2017/3/6 21:50, Hou Tao wrote: > Hi Jan and list, > > When testing the hrtimer version of CFQ, we found a performance degradation > problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at > least 1 ji

[PATCH v2] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-08 Thread Hou Tao
so I define two new macros for the delay of a cfq_group under time-slice mode and IOPs mode. Fixes: 9a7f38c42c2b92391d9dabaf9f51df7cfe5608e4 Cc: # 4.8+ Signed-off-by: Hou Tao --- block/cfq-iosched.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) v2: - use cons
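
A sketch of the shape of the v2 change; the macro names follow the description above, while the exact values and the helper are assumptions:

/* vdisktime advances in ns under time-slice mode but in jiffy-scaled
 * units under iops mode, so the insertion delay must differ per mode */
#define CFQ_SLICE_MODE_GROUP_DELAY	(NSEC_PER_SEC / 5)
#define CFQ_IOPS_MODE_GROUP_DELAY	(HZ / 5)

static u64 cfq_get_cfqg_vdisktime_delay(struct cfq_data *cfqd)
{
	return iops_mode(cfqd) ? CFQ_IOPS_MODE_GROUP_DELAY
			       : CFQ_SLICE_MODE_GROUP_DELAY;
}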

Re: [PATCH v2] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-09 Thread Hou Tao
On 2017/3/8 22:05, Jan Kara wrote: > On Wed 08-03-17 20:16:55, Hou Tao wrote: >> When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY >> as the delay of cfq_group's vdisktime if there have been other cfq_groups >> already. >> >> When cfq i

[RFC PATCH 1/2] block: add support for redirecting IO completion through eBPF

2019-10-14 Thread Hou Tao
eBPF program to the request-queue, provide some useful info (e.g., the CPU which submits the request) to the program, and let the program decide the proper CPU for IO completion handling. Signed-off-by: Hou Tao --- block/Makefile | 2 +- block/blk-bpf.c | 127
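
A hypothetical BPF-side sketch matching the RFC's description; the section name, context layout, and attach mechanism are all assumptions, not the RFC's actual ABI:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct blk_ccpu_ctx {
	__u32 submit_cpu;	/* CPU that submitted the request */
};

SEC("blkdev/ccpu")
int pick_completion_cpu(struct blk_ccpu_ctx *ctx)
{
	/* e.g. keep completions of even submit CPUs local,
	 * steer the rest to CPU 4 */
	return (ctx->submit_cpu & 1) ? 4 : ctx->submit_cpu;
}

char _license[] SEC("license") = "GPL";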

[RFC PATCH 2/2] selftests/bpf: add test program for redirecting IO completion CPU

2019-10-14 Thread Hou Tao
distribute the IO completion of nvme0n1 to a specific CPU set: ./test_blkdev_ccpu -d /dev/nvme0n1 -s 4,8,10-13 Signed-off-by: Hou Tao --- tools/include/uapi/linux/bpf.h | 2 + tools/lib/bpf/libbpf.c | 1 + tools/lib/bpf/libbpf_probes.c

[RFC PATCH 0/2] block: use eBPF to redirect IO completion

2019-10-14 Thread Hou Tao
completion handling to all online CPUs or a specific CPU set: ./test_blkdev_ccpu -d /dev/vda or ./test_blkdev_ccpu -d /dev/nvme0n1 -s 4,8,10-13 However, I am still trying to find a killer scenario for the eBPF redirection, so suggestions and comments are welcome. Regards, Tao Hou

Re: [RFC PATCH 1/2] block: add support for redirecting IO completion through eBPF

2019-10-21 Thread Hou Tao
Hi, On 2019/10/16 5:04, Alexei Starovoitov wrote: > On Mon, Oct 14, 2019 at 5:21 AM Hou Tao wrote: >> >> For network stack, RPS, namely Receive Packet Steering, is used to >> distribute network protocol processing from hardware-interrupted CPU >> to specific CPUs and a

Re: [RFC PATCH 0/3] md: export internal stats through debugfs

2019-07-26 Thread Hou Tao
Hi, On 2019/7/23 5:31, Song Liu wrote: > On Tue, Jul 2, 2019 at 6:25 AM Hou Tao wrote: >> >> Hi, >> >> There are so many io counters, stats and flags in md, so I think >> exporting this info to userspace will be helpful for online-debugging, >> especially

Re: [RFC PATCH 0/3] md: export internal stats through debugfs

2019-07-26 Thread Hou Tao
Hi, On 2019/7/23 7:30, Bob Liu wrote: > On 7/2/19 9:29 PM, Hou Tao wrote: >> Hi, >> >> There are so many io counters, stats and flags in md, so I think >> exporting this info to userspace will be helpful for online-debugging, > > For online-debugging, I'd su

[PATCH] virtio_pmem: do flush synchronously

2023-06-19 Thread Hou Tao
From: Hou Tao The following warning was reported when doing fsync on a pmem device: [ cut here ] WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct+0x340/0x520 Modules linked in: CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154 Hardware

[PATCH v2] virtio_pmem: add the missing REQ_OP_WRITE for flush bio

2023-06-21 Thread Hou Tao
From: Hou Tao The following warning was reported when doing fsync on a pmem device: [ cut here ] WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct+0x340/0x520 Modules linked in: CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154 Hardware
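
The one-line shape of the fix, as the v2, v3 and v4 postings describe it (allocation context assumed; the sanity check is the one referenced later in the thread via commit b4a6bb3a67aa):

	/* a flush bio must be marked as a write, otherwise the
	 * submit_bio_noacct() flush/fua sanity check rejects it */
	bio = bio_alloc(bdev, 0, REQ_OP_WRITE | REQ_PREFLUSH, GFP_ATOMIC);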

[PATCH v3] virtio_pmem: add the missing REQ_OP_WRITE for flush bio

2023-06-24 Thread Hou Tao
From: Hou Tao When doing mkfs.xfs on a pmem device, the following warning was reported: [ cut here ] WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct Modules linked in: CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154 Hardware name

Re: [PATCH v2] virtio_pmem: add the missing REQ_OP_WRITE for flush bio

2023-06-29 Thread Hou Tao
Hi Pankaj, On 6/22/2023 4:35 PM, Pankaj Gupta wrote: >> The following warning was reported when doing fsync on a pmem device: >> >> [ cut here ] >> WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 >> submit_bio_noacct+0x340/0x520 SNIP >> Hi Jens & Dan, >> >> I found Pank

[PATCH v4] virtio_pmem: add the missing REQ_OP_WRITE for flush bio

2023-07-13 Thread Hou Tao
From: Hou Tao When doing mkfs.xfs on a pmem device, the following warning was reported: [ cut here ] WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct Modules linked in: CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154 Hardware name: QEMU

Re: [PATCH v3] virtio_pmem: add the missing REQ_OP_WRITE for flush bio

2023-07-13 Thread Hou Tao
Hi Pankaj, On 7/13/2023 4:23 PM, Pankaj Gupta wrote: > +Cc Vishal, > >>> Fixes: b4a6bb3a67aa ("block: add a sanity check for non-write flush/fua >>> bios") >>> Signed-off-by: Hou Tao >> With 6.3+ stable Cc, feel free to add: > Hi Dan, Vishal,