Re: [PATCH 1/9] QUEUE_FLAG_NOWAIT to indicate device supports nowait

2017-08-09 Thread Shaohua Li
On Wed, Aug 09, 2017 at 05:16:23PM -0500, Goldwyn Rodrigues wrote: > > > On 08/09/2017 03:21 PM, Shaohua Li wrote: > > On Wed, Aug 09, 2017 at 10:35:39AM -0500, Goldwyn Rodrigues wrote: > >> > >> > >> On 08/09/2017 10:02 AM, Shaohua Li wrote: >

[PATCH V2 4/9] nullb: use ida to manage index

2017-08-14 Thread Shaohua Li
From: Shaohua Li We now dynamically create disks. Manage the disk index with ida to avoid bumping the index up too much. Signed-off-by: Shaohua Li --- drivers/block/null_blk.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/block/null_blk.c b/drivers/block

[PATCH V2 3/9] nullb: add interface to power on disk

2017-08-14 Thread Shaohua Li
From: Shaohua Li The device created in nullb configfs interface isn't powered on by default. After the user configures the device, the user can do 'echo 1 > xxx/nullb/device_name/power' to power on the device, which will create a disk. The xxx/nullb/device_name/index is the disk index,

[PATCH V2 8/9] nullb: emulate cache

2017-08-14 Thread Shaohua Li
From: Shaohua Li Software must flush disk cache to guarantee data safety. To check if software correctly does disk cache flush, we must know the behavior of disk. But physical disk behavior is uncontrollable. Even if software doesn't do the flush, the disk probably does the flush. This patch

[PATCH V2 1/9] nullb: factor disk parameters

2017-08-14 Thread Shaohua Li
From: Shaohua Li When we switch to configfs interface, each disk could have different configuration. To prepare for the change, we move most disk settings to a separate data structure. The existing module parameter interface is kept. The 'nr_devices' and 'shared_tags' don'

[PATCH V2 9/9] nullb: badbblocks support

2017-08-14 Thread Shaohua Li
From: Shaohua Li Sometimes a disk could have broken tracks and the data there is inaccessible, but data in other parts can be accessed in the normal way. MD RAID supports such disks. But we don't have a good way to test it, because we can't control which part of a physical disk is bad. For a vi

[PATCH V2 2/9] nullb: add configfs interface

2017-08-14 Thread Shaohua Li
From: Shaohua Li Add configfs interface for nullb. configfs interface is more flexible and easy to configure on a per-disk basis. Configuration is something like this: mount -t configfs none /mnt Checking which features the driver supports: cat /mnt/nullb/features The 'features' at

[PATCH V2 0/9] nullb: extend nullb for destructive tests

2017-08-14 Thread Shaohua Li
From: Shaohua Li In testing software RAID, I usually found it's hard to cover specific cases. RAID is supposed to work even if a disk is in a semi-good state, for example, some sectors are broken. Since we can't control the behavior of hardware, it's difficult to create test s

[PATCH V2 6/9] nullb: support discard

2017-08-14 Thread Shaohua Li
From: Shaohua Li discard makes sense for memory backed disk. And also it's useful to test if upper layer supports discard correctly. User configures 'discard' attribute to enable/disable discard support. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh

[PATCH V2 5/9] nullb: support memory backed store

2017-08-14 Thread Shaohua Li
From: Shaohua Li This adds memory backed store in nullb. User configure 'memory_backed' attribute for this. By default, nullb disk doesn't use memory backed store. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh Signed-off-by: Shaohua Li --- drivers/b

[PATCH V2 7/9] nullb: bandwidth control

2017-08-14 Thread Shaohua Li
From: Shaohua Li In test, we usually expect controllable disk speed. For example, in a raid array, we'd like some disks to be fast and some to be slow. MD RAID actually has a feature for this. To test the feature, we'd like to make the disk run at a specific speed. block throttling proba

[PATCH] blk-throttle: ignore discard request size

2017-08-18 Thread Shaohua Li
rd request into iops budget though. Signed-off-by: Shaohua Li --- block/blk-throttle.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 6a4c4c4..f80acc1 100644 --- a/block/blk-throttle.c +++ b/block/blk-

Re: [PATCH] blk-throttle: ignore discard request size

2017-08-18 Thread Shaohua Li
On Fri, Aug 18, 2017 at 09:35:01AM -0600, Jens Axboe wrote: > On 08/18/2017 09:13 AM, Shaohua Li wrote: > > discard request usually is very big and easily use all bandwidth budget > > of a cgroup. discard request size doesn't really mean the size of data > > written, s

Re: [PATCH] blk-throttle: ignore discard request size

2017-08-18 Thread Shaohua Li
On Fri, Aug 18, 2017 at 01:06:46PM -0600, Jens Axboe wrote: > On 08/18/2017 10:28 AM, Shaohua Li wrote: > > On Fri, Aug 18, 2017 at 09:35:01AM -0600, Jens Axboe wrote: > >> On 08/18/2017 09:13 AM, Shaohua Li wrote: > >>> discard request usually is very big and e

Re: [PATCH] blk-throttle: ignore discard request size

2017-08-18 Thread Shaohua Li
On Fri, Aug 18, 2017 at 01:15:15PM -0600, Jens Axboe wrote: > On 08/18/2017 01:12 PM, Shaohua Li wrote: > > On Fri, Aug 18, 2017 at 01:06:46PM -0600, Jens Axboe wrote: > >> On 08/18/2017 10:28 AM, Shaohua Li wrote: > >>> On Fri, Aug 18, 2017 at 09:35:01AM -0600, Jen

[PATCH V2] blk-throttle: cap discard request size

2017-08-18 Thread Shaohua Li
discard request does have cost. But it's not easy to find the actual cost. This patch simply makes the size one sector. Signed-off-by: Shaohua Li --- block/blk-throttle.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/block/blk-throttle.c b/block/
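The idea in this patch, charging a discard as one sector rather than its nominal size, can be sketched in userspace C. This is only an illustration under stated assumptions: `struct io_req` and `throttle_charge_bytes` are hypothetical names made up for the sketch, not the identifiers used in block/blk-throttle.c, and a sector is assumed to be 512 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u /* assumed sector size; the capped cost of a discard */

/* Hypothetical stand-in for a bio: only the fields this sketch needs. */
struct io_req {
    bool is_discard;     /* true for a discard request */
    uint32_t size_bytes; /* nominal request size */
};

/* Bytes to charge against the cgroup's bandwidth budget. */
static inline uint32_t throttle_charge_bytes(const struct io_req *req)
{
    assert(req != NULL);
    /* A discard's size is not the amount of data written, so cap its cost. */
    if (req->is_discard)
        return SECTOR_BYTES;
    return req->size_bytes;
}
```

With this accounting a 1 GiB discard consumes the same bandwidth budget as a one-sector write, so a large discard no longer exhausts the cgroup's budget by itself.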

[RFC] block/loop: make loop cgroup aware

2017-08-23 Thread Shaohua Li
From: Shaohua Li Not a merge request, for discussion only. loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite c

Re: [PATCH 2/6] raid5: remove a call to get_start_sect

2017-08-23 Thread Shaohua Li
ctors); > return chunk_sectors >= > ((sector & (chunk_sectors - 1)) + bio_sectors); Reviewed-by: Shaohua Li
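The quoted return expression checks whether a bio stays inside a single chunk. A minimal standalone sketch of the same arithmetic follows; the function name `fits_in_chunk` is hypothetical, and the sketch assumes `chunk_sectors` is a power of two, which is what makes the mask `chunk_sectors - 1` yield the offset within the chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a bio of bio_sectors starting at sector does not
 * cross a chunk boundary. Assumes chunk_sectors is a power of two.
 */
static inline bool fits_in_chunk(uint64_t sector, uint32_t chunk_sectors,
                                 uint32_t bio_sectors)
{
    assert(chunk_sectors != 0 && (chunk_sectors & (chunk_sectors - 1)) == 0);
    /* (sector & (chunk_sectors - 1)) is the offset inside the chunk. */
    return chunk_sectors >= ((sector & (chunk_sectors - 1)) + bio_sectors);
}
```

For example, with 8-sector chunks a 2-sector bio at sector 6 ends exactly on the boundary and fits, while a 3-sector bio at the same sector spills into the next chunk.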

Re: [RFC] block/loop: make loop cgroup aware

2017-08-23 Thread Shaohua Li
On Wed, Aug 23, 2017 at 03:21:25PM -0400, Vivek Goyal wrote: > On Wed, Aug 23, 2017 at 11:15:15AM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Not a merge request, for discussion only. > > > > loop block device handles IO in a separate thread. The actu

[PATCH 1/2] block/loop: set hw_sectors

2017-08-23 Thread Shaohua Li
From: Shaohua Li Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for the underlying disk and also causes bad fs block allocation in directio mode. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 1 + 1 file changed, 1
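To put a number on the split cost mentioned above, here is a small worked sketch. `requests_needed` is a hypothetical helper, not kernel code; it only counts how many pieces an IO is cut into for a given per-request sector limit.

```c
#include <assert.h>
#include <stdint.h>

/* Number of requests an IO of io_sectors is split into (ceiling division). */
static inline uint32_t requests_needed(uint32_t io_sectors, uint32_t max_sectors)
{
    assert(max_sectors != 0);
    return io_sectors / max_sectors + (io_sectors % max_sectors != 0);
}
```

A 1 MiB IO is 2048 sectors of 512 bytes. With a 255-sector limit it is split into 9 requests (8 full ones plus an 8-sector tail); with a limit of 2048 sectors or more it goes down as a single request. The 2048 figure is only an example, since the snippet does not show the new limit.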

[PATCH 0/2] block/loop: improve performance

2017-08-23 Thread Shaohua Li
From: Shaohua Li two small patches to improve performance for loop in directio mode. The goal is to increase IO size sending to underlayer disks. Thanks, Shaohua Shaohua Li (2): block/loop: set hw_sectors block/loop: allow request merge for directio mode drivers/block/loop.c | 44

[PATCH 2/2] block/loop: allow request merge for directio mode

2017-08-23 Thread Shaohua Li
From: Shaohua Li Currently loop disables merge. While it makes sense for buffer IO mode, directio mode can benefit from request merge. Without merge, loop could send small size IO to underlayer disk and harm performance. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 43

Re: [PATCH 2/2] block/loop: allow request merge for directio mode

2017-08-24 Thread Shaohua Li
On Thu, Aug 24, 2017 at 10:57:39AM -0700, Omar Sandoval wrote: > On Wed, Aug 23, 2017 at 04:49:24PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently loop disables merge. While it makes sense for buffer IO mode, > > directio mode can benefit from request

[PATCH V2 1/2] block/loop: set hw_sectors

2017-08-24 Thread Shaohua Li
From: Shaohua Li Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for the underlying disk and also causes bad fs block allocation in directio mode. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block/loop.c | 1

[PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-24 Thread Shaohua Li
From: Shaohua Li Currently loop disables merge. While it makes sense for buffer IO mode, directio mode can benefit from request merge. Without merge, loop could send small size IO to underlayer disk and harm performance. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block

[PATCH V2 0/2] block/loop: improve performance

2017-08-24 Thread Shaohua Li
From: Shaohua Li two small patches to improve performance for loop in directio mode. The goal is to increase IO size sending to underlayer disks. As Omar pointed out, the patches have slight conflict with his, but should be easy to fix. Thanks, Shaohua Shaohua Li (2): block/loop: set

[PATCH] block/nullb: fix NULL deference

2017-08-25 Thread Shaohua Li
L here. Reported-by: Dan Carpenter Signed-off-by: Shaohua Li --- drivers/block/null_blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c index 2032360..4d328e3 100644 --- a/drivers/block/null_blk.c +++ b/drivers/block/null_blk

[PATCH] block/nullb: delete unnecessary memory free

2017-08-28 Thread Shaohua Li
Commit 2984c86(nullb: factor disk parameters) has a typo. The nullb_device allocation/free is done outside of null_add_dev. The commit accidentally frees the nullb_device in error code path. Reported-by: Dan Carpenter Signed-off-by: Shaohua Li --- drivers/block/null_blk.c | 1 - 1 file changed

Re: [PATCH] block/nullb: delete unnecessary memory free

2017-08-28 Thread Shaohua Li
On Mon, Aug 28, 2017 at 02:55:59PM -0600, Jens Axboe wrote: > On 08/28/2017 02:49 PM, Shaohua Li wrote: > > Commit 2984c86(nullb: factor disk parameters) has a typo. The > > nullb_device allocation/free is done outside of null_add_dev. The commit > > accidentally frees the

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-29 Thread Shaohua Li
On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote: > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently loop disables merge. While it makes sense for buffer IO mode, > > directio mode can benefit from request merge

Re: [RFC] block/loop: make loop cgroup aware

2017-08-29 Thread Shaohua Li
On Mon, Aug 28, 2017 at 03:54:59PM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Wed, Aug 23, 2017 at 11:15:15AM -0700, Shaohua Li wrote: > > loop block device handles IO in a separate thread. The actual IO > > dispatched isn't cloned from the IO loop device received

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-29 Thread Shaohua Li
On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote: > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote: > > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote: > > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote: >

Re: [RFC] block/loop: make loop cgroup aware

2017-08-29 Thread Shaohua Li
On Tue, Aug 29, 2017 at 08:28:09AM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Tue, Aug 29, 2017 at 08:22:36AM -0700, Shaohua Li wrote: > > > Yeah, writeback tracks the most active cgroup and associates writeback > > > ios with that cgroup. For buffered loop de

[PATCH 1/3] block/loop: don't hijack error number

2017-08-30 Thread Shaohua Li
If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li --- drivers/block/loop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index ef83349..054dccc 100644 --- a/drivers/block/l
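The difference between hijacking and preserving the error can be sketched as two tiny helpers. Both function names are hypothetical and this is not the loop driver's code; the sketch assumes the completion value is either 0 or a negative errno.

```c
#include <assert.h>
#include <errno.h>

/* Behavior the patch argues against: every failure collapses to -EIO. */
static inline int status_hijacked(int ret)
{
    assert(ret <= 0); /* sketch handles only 0 or a negative errno */
    return ret ? -EIO : 0;
}

/* Behavior the patch argues for: the original errno reaches the caller. */
static inline int status_preserved(int ret)
{
    assert(ret <= 0); /* sketch handles only 0 or a negative errno */
    return ret;
}
```

With the first helper a caller cannot tell "operation not supported" from a real IO failure; with the second, -EOPNOTSUPP survives, so upper layers can react to it, for instance by falling back to another method.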

[PATCH 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-08-30 Thread Shaohua Li
REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE not, even loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file changed, 3 insertions(+)
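The mapping described here, write-zeroes to FALLOC_FL_ZERO_RANGE and discard to FALLOC_FL_PUNCH_HOLE, can be sketched as a mode-selection helper. The enum and function names are hypothetical. FALLOC_FL_PUNCH_HOLE must be combined with FALLOC_FL_KEEP_SIZE per fallocate(2); adding FALLOC_FL_KEEP_SIZE to the zero-range case is an assumption of this sketch, not something the snippet states.

```c
#include <assert.h>
#include <linux/falloc.h> /* FALLOC_FL_* mode flags (Linux only) */

/* Hypothetical request kinds standing in for REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES. */
enum sketch_op { SKETCH_OP_DISCARD, SKETCH_OP_WRITE_ZEROES };

/* Pick the fallocate mode for a request kind. */
static inline int falloc_mode_for(enum sketch_op op)
{
    switch (op) {
    case SKETCH_OP_WRITE_ZEROES:
        /* Zero the range; KEEP_SIZE here is an assumption of the sketch. */
        return FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE;
    case SKETCH_OP_DISCARD:
        /* PUNCH_HOLE is only valid together with KEEP_SIZE. */
        return FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
    }
    assert(0 && "unknown op");
    return 0;
}
```

The returned value would be passed as the mode argument of fallocate(2) on the backing file.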

[PATCH 0/3] block/loop: handle discard/zeroout error

2017-08-30 Thread Shaohua Li
is correct behavior? Thanks, Shaohua Shaohua Li (3): block/loop: don't hijack error number block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES block/loop: suppress discard IO error message drivers/block/loop.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) -- 2.9.5

[PATCH 3/3] block/loop: suppress discard IO error message

2017-08-30 Thread Shaohua Li
h will suppress the IO error message Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index a30aa45..15f51e3 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -441,6 +441,9 @@ static

[PATCH] block/loop: fix use after free

2017-08-30 Thread Shaohua Li
lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. This conflicts with my directio performance improvement patches, which I'll resend. Signed-off-

[PATCH] block/loop: fix use after feee

2017-08-30 Thread Shaohua Li
lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. This conflicts with my directio performance improvement patches, which I'll resend. Signed-off-

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-30 Thread Shaohua Li
On Wed, Aug 30, 2017 at 02:43:40PM +0800, Ming Lei wrote: > On Tue, Aug 29, 2017 at 09:43:20PM -0700, Shaohua Li wrote: > > On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote: > > > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote: > > > > On Tue, Au

Re: [PATCH] block/loop: fix use after feee

2017-08-30 Thread Shaohua Li
On Wed, Aug 30, 2017 at 02:51:05PM -0700, Shaohua Li wrote: > lo_rw_aio->call_read_iter-> > 1 aops->direct_IO > 2 iov_iter_revert > lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could > be freed before 2, which accesses bvec. please ignore this

[PATCH V3 1/2] block/loop: set hw_sectors

2017-08-31 Thread Shaohua Li
From: Shaohua Li Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for the underlying disk and also causes bad fs block allocation in directio mode. Reviewed-by: Omar Sandoval Reviewed-by: Ming Lei Signed-off-by: Shaohua Li

[PATCH V3 0/2] block/loop: improve performance

2017-08-31 Thread Shaohua Li
From: Shaohua Li two small patches to improve performance for loop in directio mode. The goal is to increase IO size sending to underlayer disks. Thanks, Shaohua V2 -> V3: - Use GFP_NOIO pointed out by Ming - Rebase to latest for-next branch Shaohua Li (2): block/loop: set hw_sect

[PATCH V3 2/2] block/loop: allow request merge for directio mode

2017-08-31 Thread Shaohua Li
From: Shaohua Li Currently loop disables merge. While it makes sense for buffer IO mode, directio mode can benefit from request merge. Without merge, loop could send small size IO to underlayer disk and harm performance. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block

[PATCH V2 2/2] block/loop: remove unused field

2017-09-01 Thread Shaohua Li
From: Shaohua Li nobody uses the list. Signed-off-by: Shaohua Li --- drivers/block/loop.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/loop.h b/drivers/block/loop.h index b0ba4a5..f68c1d5 100644 --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -67,7 +67,6 @@ struct

[PATCH V2 1/2] block/loop: fix use after free

2017-09-01 Thread Shaohua Li
From: Shaohua Li lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 16 +--- driv

Re: [PATCH V6 00/18] blk-throttle: add .low limit

2017-09-05 Thread Shaohua Li
On Thu, Aug 31, 2017 at 09:24:23AM +0200, Paolo VALENTE wrote: > > > Il giorno 15 gen 2017, alle ore 04:42, Shaohua Li ha scritto: > > > > Hi, > > > > cgroup still lacks a good iocontroller. CFQ works well for hard disk, but > > not > > much for

Re: Enable skip_copy can cause data integrity issue in some storage stack

2017-09-06 Thread Shaohua Li
On Fri, Sep 01, 2017 at 03:26:41PM +0800, alexwu wrote: > Hi, > > Recently a data integrity issue about skip_copy was found. We are able > to reproduce it and found the root cause. This data integrity issue > might happen if there are other layers between file system and raid5. > > [How to Reprod

Re: [PATCH V6 00/18] blk-throttle: add .low limit

2017-09-06 Thread Shaohua Li
On Wed, Sep 06, 2017 at 09:12:20AM +0800, Joseph Qi wrote: > Hi Shaohua, > > On 17/9/6 05:02, Shaohua Li wrote: > > On Thu, Aug 31, 2017 at 09:24:23AM +0200, Paolo VALENTE wrote: > >> > >>> Il giorno 15 gen 2017, alle ore 04:42, Shaohua Li ha > >>

[PATCH V2 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-09-06 Thread Shaohua Li
From: Shaohua Li REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE not, even loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file chang

[PATCH V2 0/3] block/loop: handle discard/zeroout error

2017-09-06 Thread Shaohua Li
From: Shaohua Li Fix some problems when setting up loop device with a block device as back file and create/mount ext4 in the loop device. BTW: blkdev_issue_zeroout retries if we immediately find the device doesn't support zeroout, but it doesn't retry if submit_bio_wait returns -EOPN

[PATCH V2 1/3] block/loop: don't hijack error number

2017-09-06 Thread Shaohua Li
From: Shaohua Li If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li --- drivers/block/loop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 85de673..715b762 100644

[PATCH V2 3/3] block/loop: suppress discard IO error message

2017-09-06 Thread Shaohua Li
From: Shaohua Li We don't know if fallocate really supports FALLOC_FL_PUNCH_HOLE till fallocate is called. If it doesn't support, loop will return -EOPNOTSUPP and we see a lot of error message printed by blk_update_request. Failure for discard IO isn't a big problem, so we just r

[PATCH 0/3] block: make loop block device cgroup aware

2017-09-06 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to the underlying disk by the loop block device isn't cloned from the original bio, so the IO loses the cgroup information of the original bio. Such IO escapes cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

[PATCH 3/3] block/loop: make loop cgroup aware

2017-09-06 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH 2/3] block: make blkcg aware of kthread stored original cgroup info

2017-09-06 Thread Shaohua Li
From: Shaohua Li Several blkcg APIs are deprecated. After removing them, bio_blkcg is the only API to get cgroup info for a bio. If bio_blkcg finds current task is a kthread and has original css recorded, it will use the css instead of associating the bio to current task. Signed-off-by: Shaohua

[PATCH 1/3] kthread: add a mechanism to store cgroup info

2017-09-06 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a mechanism to record cgroup info of original threads in kthread

Re: [PATCH V2 3/3] block/loop: suppress discard IO error message

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 05:16:21PM +0800, Ming Lei wrote: > On Thu, Sep 7, 2017 at 8:13 AM, Shaohua Li wrote: > > From: Shaohua Li > > > > We don't know if fallocate really supports FALLOC_FL_PUNCH_HOLE till > > fallocate is called. If it doesn't support, lo

Re: [PATCH V2 0/3] block/loop: handle discard/zeroout error

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 03:20:01PM +0200, Ilya Dryomov wrote: > Hi Shaohua, > > You wrote: > > BTW: blkdev_issue_zeroout retries if we immediately find the device doesn't > > support zeroout, but it doesn't retry if submit_bio_wait returns > > -EOPNOTSUPP. > > Is this correct behavior? > > I sen

BDI_CAP_STABLE_WRITES for stacked device (Re: Enable skip_copy can cause data integrity issue in some storage stack)

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 11:11:24AM +1000, Neil Brown wrote: > On Wed, Sep 06 2017, Shaohua Li wrote: > > > On Fri, Sep 01, 2017 at 03:26:41PM +0800, alexwu wrote: > >> Hi, > >> > >> Recently a data integrity issue about skip_copy was found. We are able

Re: [PATCH 1/3] kthread: add a mechanism to store cgroup info

2017-09-08 Thread Shaohua Li
On Fri, Sep 08, 2017 at 07:35:37AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 06, 2017 at 07:00:51PM -0700, Shaohua Li wrote: > > +#ifdef CONFIG_CGROUPS > > +void kthread_set_orig_css(struct cgroup_subsys_state *css); > > +struct cgroup_subsys_state *kthread_ge

Re: [PATCH 3/3] block/loop: make loop cgroup aware

2017-09-08 Thread Shaohua Li
On Fri, Sep 08, 2017 at 07:48:09AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 06, 2017 at 07:00:53PM -0700, Shaohua Li wrote: > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index 9d4545f..9850b27 100644 > > --- a/drivers/block/loop.c >

[PATCH V2 0/4] block: make loop block device cgroup aware

2017-09-13 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to the underlying disk by the loop block device isn't cloned from the original bio, so the IO loses the cgroup information of the original bio. Such IO escapes cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

[PATCH V2 4/4] block/loop: make loop cgroup aware

2017-09-13 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH V2 3/4] block: make blkcg aware of kthread stored original cgroup info

2017-09-13 Thread Shaohua Li
From: Shaohua Li bio_blkcg is the only API to get cgroup info for a bio right now. If bio_blkcg finds current task is a kthread and has original blkcg associated, it will use the css instead of associating the bio to current task. This makes it possible that kthread dispatches bios on behalf of

[PATCH V2 1/4] kthread: add a mechanism to store cgroup info

2017-09-13 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a mechanism to record cgroup info of original threads in kthread

[PATCH V2 2/4] blkcg: delete unused APIs

2017-09-13 Thread Shaohua Li
From: Shaohua Li Nobody uses the APIs right now. Signed-off-by: Shaohua Li --- block/bio.c| 31 --- include/linux/bio.h| 2 -- include/linux/blk-cgroup.h | 12 3 files changed, 45 deletions(-) diff --git a/block/bio.c b/block

Re: [PATCH V2 1/4] kthread: add a mechanism to store cgroup info

2017-09-13 Thread Shaohua Li
On Wed, Sep 13, 2017 at 02:38:20PM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 13, 2017 at 02:01:26PM -0700, Shaohua Li wrote: > > diff --git a/kernel/kthread.c b/kernel/kthread.c > > index 26db528..3107eee 100644 > > --- a/kernel/kthread.c > > +++ b/ke

[PATCH V3 3/4] block: make blkcg aware of kthread stored original cgroup info

2017-09-14 Thread Shaohua Li
From: Shaohua Li bio_blkcg is the only API to get cgroup info for a bio right now. If bio_blkcg finds current task is a kthread and has original blkcg associated, it will use the css instead of associating the bio to current task. This makes it possible that kthread dispatches bios on behalf of

[PATCH V3 0/4] block: make loop block device cgroup aware

2017-09-14 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to the underlying disk by the loop block device isn't cloned from the original bio, so the IO loses the cgroup information of the original bio. Such IO escapes cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

[PATCH V3 4/4] block/loop: make loop cgroup aware

2017-09-14 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH V3 2/4] blkcg: delete unused APIs

2017-09-14 Thread Shaohua Li
From: Shaohua Li Nobody uses the APIs right now. Acked-by: Tejun Heo Signed-off-by: Shaohua Li --- block/bio.c| 31 --- include/linux/bio.h| 2 -- include/linux/blk-cgroup.h | 12 3 files changed, 45 deletions(-) diff --git a

[PATCH V3 1/4] kthread: add a mechanism to store cgroup info

2017-09-14 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a mechanism to record cgroup info of original threads in kthread

[PATCH] block: fix a crash caused by wrong API

2017-09-21 Thread Shaohua Li
part_stat_show takes a part device not a disk, so we should use part_to_disk. Fix: d62e26b3ffd2(block: pass in queue to inflight accounting) Cc: Bart Van Assche Cc: Omar Sandoval Signed-off-by: Shaohua Li --- block/partition-generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

Re: [PATCH V3 0/4] block: make loop block device cgroup aware

2017-09-25 Thread Shaohua Li
On Thu, Sep 14, 2017 at 02:02:03PM -0700, Shaohua Li wrote: > From: Shaohua Li > > Hi, > > The IO dispatched to under layer disk by loop block device isn't cloned from > original bio, so the IO loses cgroup information of original bio. These IO > escapes from cgroup c

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-25 Thread Shaohua Li
On Mon, Sep 25, 2017 at 06:46:42PM +0800, Joseph Qi wrote: > From: Joseph Qi > > Currently it will try to dispatch bio in throtl_upgrade_state. This may > lead to io stall in the following case. > Say the hierarchy is like: > /-test1 > |-subtest1 > and subtest1 has 32 queued bios now. > > thro

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-25 Thread Shaohua Li
On Tue, Sep 26, 2017 at 09:06:57AM +0800, Joseph Qi wrote: > Hi Shaohua, > > On 17/9/26 01:22, Shaohua Li wrote: > > On Mon, Sep 25, 2017 at 06:46:42PM +0800, Joseph Qi wrote: > >> From: Joseph Qi > >> > >> Currently it will try to dispatch bio in thro

[PATCH] block: fix a build error

2017-09-26 Thread Shaohua Li
The code is only for blkcg not for all cgroups Reported-by: kbuild test robot Signed-off-by: Shaohua Li --- drivers/block/loop.c| 2 +- include/linux/kthread.h | 2 +- kernel/kthread.c| 8 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/block/loop.c

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-27 Thread Shaohua Li
On Tue, Sep 26, 2017 at 11:16:05AM +0800, Joseph Qi wrote: > > > On 17/9/26 10:48, Shaohua Li wrote: > > On Tue, Sep 26, 2017 at 09:06:57AM +0800, Joseph Qi wrote: > >> Hi Shaohua, > >> > >> On 17/9/26 01:22, Shaohua Li wrote: > >>> On

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-28 Thread Shaohua Li
On Thu, Sep 28, 2017 at 07:19:45PM +0800, Joseph Qi wrote: > > > On 17/9/28 11:48, Joseph Qi wrote: > > Hi Shahua, > > > > On 17/9/28 05:38, Shaohua Li wrote: > >> On Tue, Sep 26, 2017 at 11:16:05AM +0800, Joseph Qi wrote: > >>> > >>>

Re: [PATCH v2] blk-throttle: fix possible io stall when upgrade to max

2017-10-01 Thread Shaohua Li
otl_schedule_next_dispatch(sq, true); > } > rcu_read_unlock(); > throtl_select_dispatch(&td->service_queue); > - throtl_schedule_next_dispatch(&td->service_queue, false); > + throtl_schedule_next_dispatch(&td->service_queue, true); > queue_work(kthrotld_workqueue, &td->dispatch_work); > } Reviewed-by: Shaohua Li

Re: [PATCH] null_blk: change configfs dependency to select

2017-10-03 Thread Shaohua Li
ded to debug, since it got killed when the config > updated after the configfs change was merged. > > Fixes: 3bf2bd20734e ("nullb: add configfs interface") > Signed-off-by: Jens Axboe Reviewed-by: Shaohua Li > diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig >

[PATCH V3 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-10-04 Thread Shaohua Li
From: Shaohua Li REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE not, even loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li Reviewed-by: Ming Lei --- drivers/block/loop.

[PATCH V3 0/3] block/loop: handle discard/zeroout error

2017-10-04 Thread Shaohua Li
From: Shaohua Li Fix some problems when setting up loop device with a block device as back file and create/mount ext4 in the loop device. Thanks, Shaohua Shaohua Li (3): block/loop: don't hijack error number block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES block: don'

[PATCH V3 3/3] block: don't print message for discard error

2017-10-04 Thread Shaohua Li
From: Shaohua Li discard error isn't fatal, don't flood discard error messages. Suggested-by: Ming Lei Signed-off-by: Shaohua Li --- block/blk-core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index 14f7674..adb064a 100644 --- a/block/

[PATCH V3 1/3] block/loop: don't hijack error number

2017-10-04 Thread Shaohua Li
From: Shaohua Li If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li Reviewed-by: Ming Lei --- drivers/block/loop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index bc

[RFC 1/2] block: record blkcss in request

2017-10-04 Thread Shaohua Li
From: Shaohua Li Currently we record block css info in bio but not in request. Normally we can get a request's css from its bio, but in some situations, we can't access request's bio, for example, after blk_update_request. Add the css to request, so we can access css through the

[RFC 0/2] block: export latency info for cgroups

2017-10-04 Thread Shaohua Li
From: Shaohua Li Hi, latency info is a good sign to determine if IO is healthy. The patches export such info to cgroup io.stat. Thanks, Shaohua Shaohua Li (2): block: record blkcss in request blockcg: export latency info for each cgroup block/blk-cgroup.c | 29

[RFC 2/2] blockcg: export latency info for each cgroup

2017-10-04 Thread Shaohua Li
From: Shaohua Li Export the latency info to user. The latency is a good sign to indicate if IO is congested or not. User can use the info to make decisions such as adjusting cgroup settings. Existing io.stat shows accumulated IO bytes and requests, but accumulated value for latency doesn't make

Re: [RFC 1/2] block: record blkcss in request

2017-10-04 Thread Shaohua Li
On Wed, Oct 04, 2017 at 10:51:49AM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Wed, Oct 04, 2017 at 10:41:19AM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently we record block css info in bio but not in request. Normally > > we can get a r

Re: [RFC 2/2] blockcg: export latency info for each cgroup

2017-10-05 Thread Shaohua Li
On Wed, Oct 04, 2017 at 11:04:39AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Oct 04, 2017 at 10:41:20AM -0700, Shaohua Li wrote: > > Export the latency info to user. The latency is a good sign to indicate > > if IO is congested or not. User can use the info to make deci

Re: [PATCH v2 0/9] Nowait support for stacked block devices

2017-10-05 Thread Shaohua Li
On Wed, Oct 04, 2017 at 08:55:02AM -0500, Goldwyn Rodrigues wrote: > This is a continuation of the nowait support which was incorporated > a while back. We introduced REQ_NOWAIT which would return immediately > if the call would block at the block layer. Request based-devices > do not wait. However

[PATCH] blk-stat: delete useless code

2017-10-05 Thread Shaohua Li
Fix two issues: - the per-cpu stat flush is unnecessary; nobody uses the per-cpu stat except to sum it into the global stat. We can do the calculation there. The flush just wastes cpu time. - some fields are signed int/s64. I don't see the point. Cc: Omar Sandoval Signed-off-by: Shaohua Li ---

[PATCH V2 0/3] block: export latency info for cgroups

2017-10-06 Thread Shaohua Li
From: Shaohua Li Hi, latency info is a good sign to determine if IO is healthy. The patches export such info to cgroup io.stat. I sent the first patch separately before, but since the latter depends on it, I include it here. Thanks, Shaohua V1->V2: improve the scalability Shaohua Li

[PATCH V2 2/3] block: set request_list for request

2017-10-06 Thread Shaohua Li
From: Shaohua Li Legacy queue sets request's request_list, mq doesn't. This makes mq do the same thing, so we can find the cgroup of a request. Note that we really only use the blkg field of request_list, so it's pointless to allocate a mempool for request_list in the mq case. Signed-off

[PATCH V2 1/3] blk-stat: delete useless code

2017-10-06 Thread Shaohua Li
From: Shaohua Li Fix two issues: - the per-cpu stat flush is unnecessary; nobody uses the per-cpu stat except to sum it into the global stat. We can do the calculation there. The flush just wastes cpu time. - some fields are signed int/s64. I don't see the point. Cc: Omar Sandoval Signed-o

[PATCH V2 3/3] blockcg: export latency info for each cgroup

2017-10-06 Thread Shaohua Li
From: Shaohua Li Export the latency info to user. The latency is a good sign to indicate if IO is congested or not. User can use the info to make decisions like adjust cgroup settings. Existing io.stat shows accumulated IO bytes and requests, but accumulated value for latency doesn't make

Re: [PATCH] blk-throttle: fix null pointer dereference while throttling writeback IOs

2017-10-10 Thread Shaohua Li
} > > lat = finish_time - start_time; > /* this is only for bio based driver */ > @@ -2314,6 +2320,8 @@ void blk_throtl_bio_endio(struct bio *bio) > tg->bio_cnt /= 2; > tg->bad_bio_cnt /= 2; > } > + > + blkg_put(tg_to_blkg(tg)); > } > #endif Reviewed-by: Shaohua Li

Re: [PATCH V2 3/3] blockcg: export latency info for each cgroup

2017-10-10 Thread Shaohua Li
On Wed, Oct 11, 2017 at 01:35:51AM +0800, weiping zhang wrote: > On Fri, Oct 06, 2017 at 05:56:01PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Export the latency info to user. The latency is a good sign to indicate > > if IO is congested or not. User can use

Re: [PATCH] blk-throttle: fix null pointer dereference while throttling writeback IOs

2017-10-10 Thread Shaohua Li
On Tue, Oct 10, 2017 at 12:48:38PM -0600, Jens Axboe wrote: > On 10/10/2017 12:13 PM, Shaohua Li wrote: > > On Tue, Oct 10, 2017 at 11:13:32AM +0800, xuejiufei wrote: > >> From: Jiufei Xue > >> > >> A null pointer dereference can occur when blkcg is remo
