Re: [PATCH V2] block-throttle: avoid double charge

2017-12-20 Thread Shaohua Li
haohua > On 2017/11/14 4:37 AM, Shaohua Li wrote: > > If a bio is throttled and splitted after throttling, the bio could be > > resubmited and enters the throttling again. This will cause part of the > > bio is charged multiple times. If the cgroup has an IO limit, the dou

Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-19 Thread Shaohua Li
On Tue, Dec 19, 2017 at 10:17:43AM -0600, Bruno Wolff III wrote: > On Sun, Dec 17, 2017 at 21:43:50 +0800, > weiping zhang wrote: > > Hi, thanks for testing, I think you first reproduce this issue(got WARNING > > at device_add_disk) by your own build, then add my debug patch. > > The problem is

Re: [PATCH v2 2/2] blk-throttle: fix wrong initialization in case of dm device

2017-11-20 Thread Shaohua Li
On Mon, Nov 20, 2017 at 04:02:03PM -0500, Mike Snitzer wrote: > On Sun, Nov 19, 2017 at 9:00 PM, Joseph Qi wrote: > > From: Joseph Qi > > > > dm device set QUEUE_FLAG_NONROT in resume, which is after register > > queue. That is to mean, the previous initialization in > > blk_throtl_register_queue

Re: [PATCH v2 2/2] blk-throttle: fix wrong initialization in case of dm device

2017-11-20 Thread Shaohua Li
> + */ > + if (blk_queue_nonrot(blkg->q) && > + td->filtered_latency != LATENCY_FILTERED_SSD) { > + int i; > + > + td->throtl_slice = DFL_THROTL_SLICE_SSD; if CONFIG_BLK_DEV_THROTTLING_LOW isn't not s

Re: [RFC PATCH] blk-throttle: add burst allowance.

2017-11-17 Thread Shaohua Li
On Thu, Nov 16, 2017 at 08:25:58PM -0800, Khazhismel Kumykov wrote: > On Thu, Nov 16, 2017 at 8:50 AM, Shaohua Li wrote: > > On Tue, Nov 14, 2017 at 03:10:22PM -0800, Khazhismel Kumykov wrote: > >> Allows configuration additional bytes or ios before a throttle is > >>

Re: [RFC] md: make queue limits depending on limits of RAID members

2017-11-17 Thread Shaohua Li
On Wed, Nov 15, 2017 at 11:25:12AM +0100, Mariusz Dabrowski wrote: > Hi all, > > In order to be compliant with a pass-throug drive behavior, RAID queue > limits should be calculated in a way that minimal io, optimal io and > discard granularity size will be met from a single drive perspective. > C

Re: [RFC PATCH] blk-throttle: add burst allowance.

2017-11-16 Thread Shaohua Li
On Tue, Nov 14, 2017 at 03:10:22PM -0800, Khazhismel Kumykov wrote: > Allows configuration additional bytes or ios before a throttle is > triggered. > > This allows implementation of a bucket style rate-limit/throttle on a > block device. Previously, bursting to a device was limited to allowance >

[PATCH] blk-throttle: allow configure 0 for some settings --resend

2017-11-15 Thread Shaohua Li
For io.low, latency target 0 is legit. 0 for rbps/wbps/rios/wios is ok too. And we use 0 to clear io.low settings. Cc: Tejun Heo Signed-off-by: Shaohua Li --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c

[PATCH] blkcg: correctly update stat

2017-11-14 Thread Shaohua Li
esn't have BIO_THROTTLED flag. The fix will update stat ahead of bio skips from blk-throttle, but eventually the stat is correct. This patch is on top of patch: https://marc.info/?l=linux-block&m=151060860608914&w=2 Cc: Tejun Heo Signed-off-by: Shaohua Li --- include/linux/blk-cgro

[PATCH V2] block-throttle: avoid double charge

2017-11-13 Thread Shaohua Li
Heo Cc: Vivek Goyal Cc: sta...@vger.kernel.org Signed-off-by: Shaohua Li --- block/bio.c | 2 ++ block/blk-throttle.c | 8 +--- include/linux/bio.h | 2 ++ 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/block/bio.c b/block/bio.c index 8338304..d1d4d51 100644 --- a/bl

Re: [PATCH] null_blk: fix dev->badblocks leak

2017-11-08 Thread Shaohua Li
do_init_module+0x56/0x1e9 > [] load_module+0x1c47/0x26a0 > [] SyS_finit_module+0xa9/0xd0 > [] entry_SYSCALL_64_fastpath+0x13/0x94 > > Fixes: 2f54a613c942 ("nullb: badbblocks support") > Signed-off-by: David Disseldorp Reviewed-by: Shaohua Li > -

Re: [PATCH] badblocks: fix wrong return value in badblocks_set if badblocks are disabled

2017-11-03 Thread Shaohua Li
On Fri, Nov 03, 2017 at 10:13:38AM -0600, Liu Bo wrote: > Hi Shaohua, > > Given it's related to md, can you please take this thru your tree? Yes, the patch makes sense. Can you resend the patch to me? I can't find it in my inbox Thanks, Shaohua > Thanks, > > -liubo > > On Wed, Sep 27, 2017

Re: [PATCH V3 0/3] block/loop: handle discard/zeroout error

2017-10-18 Thread Shaohua Li
On Wed, Oct 04, 2017 at 07:52:42AM -0700, Shaohua Li wrote: > From: Shaohua Li > > Fix some problems when setting up loop device with a block device as back file > and create/mount ext4 in the loop device. Jens, can you look at these patches? Thanks, Shaohua > > Thanks, >

[PATCH] block-throttle: avoid double charge

2017-10-13 Thread Shaohua Li
ther disk, it's very unlikely only partno is changed. Some sort of this patch probably should go into stable since v4.2 Cc: Tejun Heo Cc: Vivek Goyal Signed-off-by: Shaohua Li --- block/bio.c | 3 +++ block/blk-throttle.c | 15 --- include/linux/blk_types.h |

Re: [PATCH v8 03/10] md: Neither resync nor reshape while the system is frozen

2017-10-12 Thread Shaohua Li
On Thu, Oct 12, 2017 at 04:59:01PM +, Bart Van Assche wrote: > On Wed, 2017-10-11 at 12:32 -0700, Shaohua Li wrote: > > On Wed, Oct 11, 2017 at 05:17:56PM +, Bart Van Assche wrote: > > > On Tue, 2017-10-10 at 18:48 -0700, Shaohua Li wrote: > > > > The proble

Re: [PATCH v2] blk-throttle: track read and write request individually

2017-10-12 Thread Shaohua Li
w limit and > wants its low limit to be guaranteed, which is not we expected in fact. > So track read and write request individually, which can bring more > precise latency control for low limit idle detection. > > Signed-off-by: Joseph Qi Reviewed-by: Shaoh

[PATCH] blk-throttle: allow configure 0 for some settings

2017-10-11 Thread Shaohua Li
For io.low, latency target 0 is legit. 0 for rbps/wbps/rios/wios is ok too as long as not all of them are 0. Signed-off-by: Shaohua Li --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 0fea76a..ee6d7b0

Re: [PATCH v8 03/10] md: Neither resync nor reshape while the system is frozen

2017-10-11 Thread Shaohua Li
On Wed, Oct 11, 2017 at 05:17:56PM +, Bart Van Assche wrote: > On Tue, 2017-10-10 at 18:48 -0700, Shaohua Li wrote: > > The problem is __md_stop_writes set some bit like MD_RECOVERY_FROZEN, which > > will prevent md_check_recovery restarting resync/reshape. I think we need >

Re: [PATCH] blk-throttle: track read and write request individually

2017-10-11 Thread Shaohua Li
On Wed, Oct 11, 2017 at 05:53:34PM +0800, Joseph Qi wrote: > From: Joseph Qi > > In mixed read/write workload on SSD, write latency is much lower than > read. But now we only track and record read latency and then use it as > threshold base for both read and write io latency accounting. As a > re

Re: [PATCH v8 03/10] md: Neither resync nor reshape while the system is frozen

2017-10-10 Thread Shaohua Li
On Tue, Oct 10, 2017 at 11:33:06PM +, Bart Van Assche wrote: > On Tue, 2017-10-10 at 15:30 -0700, Shaohua Li wrote: > > On Tue, Oct 10, 2017 at 02:03:39PM -0700, Bart Van Assche wrote: > > > Some people use the md driver on laptops and use the suspend and > > > res

Re: [PATCH v2 0/9] Nowait support for stacked block devices

2017-10-10 Thread Shaohua Li
On Fri, Oct 06, 2017 at 07:01:19AM -0500, Goldwyn Rodrigues wrote: > > > On 10/05/2017 12:19 PM, Shaohua Li wrote: > > On Wed, Oct 04, 2017 at 08:55:02AM -0500, Goldwyn Rodrigues wrote: > >> This is a continuation of the nowait support which was incorporated > >

Re: [PATCH v8 03/10] md: Neither resync nor reshape while the system is frozen

2017-10-10 Thread Shaohua Li
Signed-off-by: Bart Van Assche > Cc: Shaohua Li > Cc: linux-r...@vger.kernel.org > Cc: Ming Lei > Cc: Christoph Hellwig > Cc: Hannes Reinecke > Cc: Johannes Thumshirn > --- > drivers/md/md.c | 30 +- > 1 file changed, 29 insertions(+), 1 d

Re: [PATCH] blk-throttle: fix null pointer dereference while throttling writeback IOs

2017-10-10 Thread Shaohua Li
On Tue, Oct 10, 2017 at 12:48:38PM -0600, Jens Axboe wrote: > On 10/10/2017 12:13 PM, Shaohua Li wrote: > > On Tue, Oct 10, 2017 at 11:13:32AM +0800, xuejiufei wrote: > >> From: Jiufei Xue > >> > >> A null pointer dereference can occur when blkcg is remo

Re: [PATCH V2 3/3] blockcg: export latency info for each cgroup

2017-10-10 Thread Shaohua Li
On Wed, Oct 11, 2017 at 01:35:51AM +0800, weiping zhang wrote: > On Fri, Oct 06, 2017 at 05:56:01PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Export the latency info to user. The latency is a good sign to indicate > > if IO is congested or not. User can use

Re: [PATCH] blk-throttle: fix null pointer dereference while throttling writeback IOs

2017-10-10 Thread Shaohua Li
} > > lat = finish_time - start_time; > /* this is only for bio based driver */ > @@ -2314,6 +2320,8 @@ void blk_throtl_bio_endio(struct bio *bio) > tg->bio_cnt /= 2; > tg->bad_bio_cnt /= 2; > } > + > + blkg_put(tg_to_blkg(tg)); > } > #endif Reviewed-by: Shaohua Li

[PATCH V2 1/3] blk-stat: delete useless code

2017-10-06 Thread Shaohua Li
From: Shaohua Li Fix two issues: - the per-cpu stat flush is unnecessary, nobody uses per-cpu stat except sum it to global stat. We can do the calculation there. The flush just wastes cpu time. - some fields are signed int/s64. I don't see the point. Cc: Omar Sandoval Signed-o

[PATCH V2 3/3] blockcg: export latency info for each cgroup

2017-10-06 Thread Shaohua Li
From: Shaohua Li Export the latency info to user. The latency is a good sign to indicate if IO is congested or not. User can use the info to make decisions like adjust cgroup settings. Existing io.stat shows accumulated IO bytes and requests, but accumulated value for latency doesn't make

[PATCH V2 0/3] block: export latency info for cgroups

2017-10-06 Thread Shaohua Li
From: Shaohua Li Hi, latency info is a good sign to determine if IO is healthy. The patches export such info to cgroup io.stat. I sent the first patch separately before, but since the latter depends on it, I include it here. Thanks, Shaohua V1->V2: improve the scalability Shaohua Li

[PATCH V2 2/3] block: set request_list for request

2017-10-06 Thread Shaohua Li
From: Shaohua Li Legacy queue sets request's request_list, mq doesn't. This makes mq does the same thing, so we can find cgroup of a request. Note, we really only use blkg field of request_list, it's pointless to allocate mempool for request_list in mq case. Signed-off

[PATCH] blk-stat: delete useless code

2017-10-05 Thread Shaohua Li
Fix two issues: - the per-cpu stat flush is unnecessary, nobody uses per-cpu stat except sum it to global stat. We can do the calculation there. The flush just wastes cpu time. - some fields are signed int/s64. I don't see the point. Cc: Omar Sandoval Signed-off-by: Shaohua Li ---

Re: [PATCH v2 0/9] Nowait support for stacked block devices

2017-10-05 Thread Shaohua Li
On Wed, Oct 04, 2017 at 08:55:02AM -0500, Goldwyn Rodrigues wrote: > This is a continuation of the nowait support which was incorporated > a while back. We introduced REQ_NOWAIT which would return immediately > if the call would block at the block layer. Request based-devices > do not wait. However

Re: [RFC 2/2] blockcg: export latency info for each cgroup

2017-10-05 Thread Shaohua Li
On Wed, Oct 04, 2017 at 11:04:39AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Oct 04, 2017 at 10:41:20AM -0700, Shaohua Li wrote: > > Export the latency info to user. The latency is a good sign to indicate > > if IO is congested or not. User can use the info to make deci

Re: [RFC 1/2] block: record blkcss in request

2017-10-04 Thread Shaohua Li
On Wed, Oct 04, 2017 at 10:51:49AM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Wed, Oct 04, 2017 at 10:41:19AM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently we record block css info in bio but not in request. Normally > > we can get a r

[RFC 2/2] blockcg: export latency info for each cgroup

2017-10-04 Thread Shaohua Li
From: Shaohua Li Export the latency info to user. The latency is a good sign to indicate if IO is congested or not. User can use the info to make decisions like adjust cgroup settings. Existing io.stat shows accumulated IO bytes and requests, but accumulated value for latency doesn't make

[RFC 1/2] block: record blkcss in request

2017-10-04 Thread Shaohua Li
From: Shaohua Li Currently we record block css info in bio but not in request. Normally we can get a request's css from its bio, but in some situations, we can't access request's bio, for example, after blk_update_request. Add the css to request, so we can access css through the

[RFC 0/2] block: export latency info for cgroups

2017-10-04 Thread Shaohua Li
From: Shaohua Li Hi, latency info is a good sign to determine if IO is healthy. The patches export such info to cgroup io.stat. Thanks, Shaohua Shaohua Li (2): block: record blkcss in request blockcg: export latency info for each cgroup block/blk-cgroup.c | 29

[PATCH V3 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-10-04 Thread Shaohua Li
From: Shaohua Li REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE not, even loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li Reviewed-by: Ming Lei --- drivers/block/loop.

[PATCH V3 0/3] block/loop: handle discard/zeroout error

2017-10-04 Thread Shaohua Li
From: Shaohua Li Fix some problems when setting up loop device with a block device as back file and create/mount ext4 in the loop device. Thanks, Shaohua Shaohua Li (3): block/loop: don't hijack error number block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES block: don'

[PATCH V3 3/3] block: don't print message for discard error

2017-10-04 Thread Shaohua Li
From: Shaohua Li discard error isn't fatal, don't flood discard error messages. Suggested-by: Ming Lei Signed-off-by: Shaohua Li --- block/blk-core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index 14f7674..adb064a 100644 --- a/block/

[PATCH V3 1/3] block/loop: don't hijack error number

2017-10-04 Thread Shaohua Li
From: Shaohua Li If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li Reviewed-by: Ming Lei --- drivers/block/loop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index bc

Re: [PATCH] null_blk: change configfs dependency to select

2017-10-03 Thread Shaohua Li
ded to debug, since it got killed when the config > updated after the configfs change was merged. > > Fixes: 3bf2bd20734e ("nullb: add configfs interface") > Signed-off-by: Jens Axboe Reviewed-by: Shaohua Li > diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig >

Re: [PATCH v2] blk-throttle: fix possible io stall when upgrade to max

2017-10-01 Thread Shaohua Li
otl_schedule_next_dispatch(sq, true); > } > rcu_read_unlock(); > throtl_select_dispatch(&td->service_queue); > - throtl_schedule_next_dispatch(&td->service_queue, false); > + throtl_schedule_next_dispatch(&td->service_queue, true); > queue_work(kthrotld_workqueue, &td->dispatch_work); > } Reviewed-by: Shaohua Li

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-28 Thread Shaohua Li
On Thu, Sep 28, 2017 at 07:19:45PM +0800, Joseph Qi wrote: > > > On 17/9/28 11:48, Joseph Qi wrote: > > Hi Shahua, > > > > On 17/9/28 05:38, Shaohua Li wrote: > >> On Tue, Sep 26, 2017 at 11:16:05AM +0800, Joseph Qi wrote: > >>> > >>>

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-27 Thread Shaohua Li
On Tue, Sep 26, 2017 at 11:16:05AM +0800, Joseph Qi wrote: > > > On 17/9/26 10:48, Shaohua Li wrote: > > On Tue, Sep 26, 2017 at 09:06:57AM +0800, Joseph Qi wrote: > >> Hi Shaohua, > >> > >> On 17/9/26 01:22, Shaohua Li wrote: > >>> On

[PATCH] block: fix a build error

2017-09-26 Thread Shaohua Li
The code is only for blkcg not for all cgroups Reported-by: kbuild test robot Signed-off-by: Shaohua Li --- drivers/block/loop.c| 2 +- include/linux/kthread.h | 2 +- kernel/kthread.c| 8 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/block/loop.c

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-25 Thread Shaohua Li
On Tue, Sep 26, 2017 at 09:06:57AM +0800, Joseph Qi wrote: > Hi Shaohua, > > On 17/9/26 01:22, Shaohua Li wrote: > > On Mon, Sep 25, 2017 at 06:46:42PM +0800, Joseph Qi wrote: > >> From: Joseph Qi > >> > >> Currently it will try to dispatch bio in thro

Re: [PATCH] blk-throttle: fix possible io stall when doing upgrade

2017-09-25 Thread Shaohua Li
On Mon, Sep 25, 2017 at 06:46:42PM +0800, Joseph Qi wrote: > From: Joseph Qi > > Currently it will try to dispatch bio in throtl_upgrade_state. This may > lead to io stall in the following case. > Say the hierarchy is like: > /-test1 > |-subtest1 > and subtest1 has 32 queued bios now. > > thro

Re: [PATCH V3 0/4] block: make loop block device cgroup aware

2017-09-25 Thread Shaohua Li
On Thu, Sep 14, 2017 at 02:02:03PM -0700, Shaohua Li wrote: > From: Shaohua Li > > Hi, > > The IO dispatched to under layer disk by loop block device isn't cloned from > original bio, so the IO loses cgroup information of original bio. These IO > escapes from cgroup c

[PATCH] block: fix a crash caused by wrong API

2017-09-21 Thread Shaohua Li
part_stat_show takes a part device not a disk, so we should use part_to_disk. Fix: d62e26b3ffd2(block: pass in queue to inflight accounting) Cc: Bart Van Assche Cc: Omar Sandoval Signed-off-by: Shaohua Li --- block/partition-generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[PATCH V3 1/4] kthread: add a mechanism to store cgroup info

2017-09-14 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a machanism to record cgroup info of original threads in kthread

[PATCH V3 2/4] blkcg: delete unused APIs

2017-09-14 Thread Shaohua Li
From: Shaohua Li Nobody uses the APIs right now. Acked-by: Tejun Heo Signed-off-by: Shaohua Li --- block/bio.c| 31 --- include/linux/bio.h| 2 -- include/linux/blk-cgroup.h | 12 3 files changed, 45 deletions(-) diff --git a

[PATCH V3 4/4] block/loop: make loop cgroup aware

2017-09-14 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH V3 0/4] block: make loop block device cgroup aware

2017-09-14 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to under layer disk by loop block device isn't cloned from original bio, so the IO loses cgroup information of original bio. These IO escapes from cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

[PATCH V3 3/4] block: make blkcg aware of kthread stored original cgroup info

2017-09-14 Thread Shaohua Li
From: Shaohua Li bio_blkcg is the only API to get cgroup info for a bio right now. If bio_blkcg finds current task is a kthread and has original blkcg associated, it will use the css instead of associating the bio to current task. This makes it possible that kthread dispatches bios on behalf of

Re: [PATCH V2 1/4] kthread: add a mechanism to store cgroup info

2017-09-13 Thread Shaohua Li
On Wed, Sep 13, 2017 at 02:38:20PM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 13, 2017 at 02:01:26PM -0700, Shaohua Li wrote: > > diff --git a/kernel/kthread.c b/kernel/kthread.c > > index 26db528..3107eee 100644 > > --- a/kernel/kthread.c > > +++ b/ke

[PATCH V2 2/4] blkcg: delete unused APIs

2017-09-13 Thread Shaohua Li
From: Shaohua Li Nobody uses the APIs right now. Signed-off-by: Shaohua Li --- block/bio.c| 31 --- include/linux/bio.h| 2 -- include/linux/blk-cgroup.h | 12 3 files changed, 45 deletions(-) diff --git a/block/bio.c b/block

[PATCH V2 1/4] kthread: add a mechanism to store cgroup info

2017-09-13 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a machanism to record cgroup info of original threads in kthread

[PATCH V2 3/4] block: make blkcg aware of kthread stored original cgroup info

2017-09-13 Thread Shaohua Li
From: Shaohua Li bio_blkcg is the only API to get cgroup info for a bio right now. If bio_blkcg finds current task is a kthread and has original blkcg associated, it will use the css instead of associating the bio to current task. This makes it possible that kthread dispatches bios on behalf of

[PATCH V2 4/4] block/loop: make loop cgroup aware

2017-09-13 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH V2 0/4] block: make loop block device cgroup aware

2017-09-13 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to under layer disk by loop block device isn't cloned from original bio, so the IO loses cgroup information of original bio. These IO escapes from cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

Re: [PATCH 3/3] block/loop: make loop cgroup aware

2017-09-08 Thread Shaohua Li
On Fri, Sep 08, 2017 at 07:48:09AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 06, 2017 at 07:00:53PM -0700, Shaohua Li wrote: > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index 9d4545f..9850b27 100644 > > --- a/drivers/block/loop.c >

Re: [PATCH 1/3] kthread: add a mechanism to store cgroup info

2017-09-08 Thread Shaohua Li
On Fri, Sep 08, 2017 at 07:35:37AM -0700, Tejun Heo wrote: > Hello, > > On Wed, Sep 06, 2017 at 07:00:51PM -0700, Shaohua Li wrote: > > +#ifdef CONFIG_CGROUPS > > +void kthread_set_orig_css(struct cgroup_subsys_state *css); > > +struct cgroup_subsys_state *kthread_ge

BDI_CAP_STABLE_WRITES for stacked device (Re: Enable skip_copy can cause data integrity issue in some storage) stack

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 11:11:24AM +1000, Neil Brown wrote: > On Wed, Sep 06 2017, Shaohua Li wrote: > > > On Fri, Sep 01, 2017 at 03:26:41PM +0800, alexwu wrote: > >> Hi, > >> > >> Recently a data integrity issue about skip_copy was found. We are able

Re: [PATCH V2 0/3] block/loop: handle discard/zeroout error

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 03:20:01PM +0200, Ilya Dryomov wrote: > Hi Shaohua, > > You wrote: > > BTW: blkdev_issue_zeroout retries if we immediately find the device doesn't > > support zeroout, but it doesn't retry if submit_bio_wait returns > > -EOPNOTSUPP. > > Is this correct behavior? > > I sen

Re: [PATCH V2 3/3] block/loop: suppress discard IO error message

2017-09-07 Thread Shaohua Li
On Thu, Sep 07, 2017 at 05:16:21PM +0800, Ming Lei wrote: > On Thu, Sep 7, 2017 at 8:13 AM, Shaohua Li wrote: > > From: Shaohua Li > > > > We don't know if fallocate really supports FALLOC_FL_PUNCH_HOLE till > > fallocate is called. If it doesn't support, lo

[PATCH 1/3] kthread: add a mechanism to store cgroup info

2017-09-06 Thread Shaohua Li
From: Shaohua Li kthread usually runs jobs on behalf of other threads. The jobs should be charged to cgroup of original threads. But the jobs run in a kthread, where we lose the cgroup context of original threads. The patch adds a machanism to record cgroup info of original threads in kthread

[PATCH 2/3] block: make blkcg aware of kthread stored original cgroup info

2017-09-06 Thread Shaohua Li
From: Shaohua Li Several blkcg APIs are deprecated. After removing them, bio_blkcg is the only API to get cgroup info for a bio. If bio_blkcg finds current task is a kthread and has original css recorded, it will use the css instead of associating the bio to current task. Signed-off-by: Shaohua

[PATCH 3/3] block/loop: make loop cgroup aware

2017-09-06 Thread Shaohua Li
From: Shaohua Li loop block device handles IO in a separate thread. The actual IO dispatched isn't cloned from the IO loop device received, so the dispatched IO loses the cgroup context. I'm ignoring buffer IO case now, which is quite complicated. Making the loop thread aware cgro

[PATCH 0/3] block: make loop block device cgroup aware

2017-09-06 Thread Shaohua Li
From: Shaohua Li Hi, The IO dispatched to under layer disk by loop block device isn't cloned from original bio, so the IO loses cgroup information of original bio. These IO escapes from cgroup control. The patches try to address this issue. The idea is quite generic, but we currently only

[PATCH V2 3/3] block/loop: suppress discard IO error message

2017-09-06 Thread Shaohua Li
From: Shaohua Li We don't know if fallocate really supports FALLOC_FL_PUNCH_HOLE till fallocate is called. If it doesn't support, loop will return -EOPNOTSUPP and we see a lot of error message printed by blk_update_request. Failure for discard IO isn't a big problem, so we just r

[PATCH V2 1/3] block/loop: don't hijack error number

2017-09-06 Thread Shaohua Li
From: Shaohua Li If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li --- drivers/block/loop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 85de673..715b762 100644

[PATCH V2 0/3] block/loop: handle discard/zeroout error

2017-09-06 Thread Shaohua Li
From: Shaohua Li Fix some problems when setting up loop device with a block device as back file and create/mount ext4 in the loop device. BTW: blkdev_issue_zeroout retries if we immediately find the device doesn't support zeroout, but it doesn't retry if submit_bio_wait returns -EOPN

[PATCH V2 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-09-06 Thread Shaohua Li
From: Shaohua Li REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE not, even loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file chang

Re: [PATCH V6 00/18] blk-throttle: add .low limit

2017-09-06 Thread Shaohua Li
On Wed, Sep 06, 2017 at 09:12:20AM +0800, Joseph Qi wrote: > Hi Shaohua, > > On 17/9/6 05:02, Shaohua Li wrote: > > On Thu, Aug 31, 2017 at 09:24:23AM +0200, Paolo VALENTE wrote: > >> > >>> Il giorno 15 gen 2017, alle ore 04:42, Shaohua Li ha > >>

Re: Enable skip_copy can cause data integrity issue in some storage stack

2017-09-06 Thread Shaohua Li
On Fri, Sep 01, 2017 at 03:26:41PM +0800, alexwu wrote: > Hi, > > Recently a data integrity issue about skip_copy was found. We are able > to reproduce it and found the root cause. This data integrity issue > might happen if there are other layers between file system and raid5. > > [How to Reprod

Re: [PATCH V6 00/18] blk-throttle: add .low limit

2017-09-05 Thread Shaohua Li
On Thu, Aug 31, 2017 at 09:24:23AM +0200, Paolo VALENTE wrote: > > > Il giorno 15 gen 2017, alle ore 04:42, Shaohua Li ha scritto: > > > > Hi, > > > > cgroup still lacks a good iocontroller. CFQ works well for hard disk, but > > not > > much for

[PATCH V2 1/2] block/loop: fix use after free

2017-09-01 Thread Shaohua Li
From: Shaohua Li lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 16 +--- driv

[PATCH V2 2/2] block/loop: remove unused field

2017-09-01 Thread Shaohua Li
From: Shaohua Li nobody uses the list. Signed-off-by: Shaohua Li --- drivers/block/loop.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/loop.h b/drivers/block/loop.h index b0ba4a5..f68c1d5 100644 --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -67,7 +67,6 @@ struct

[PATCH V3 2/2] block/loop: allow request merge for directio mode

2017-08-31 Thread Shaohua Li
From: Shaohua Li Currently loop disables merge. While it makes sense for buffer IO mode, directio mode can benefit from request merge. Without merge, loop could send small size IO to underlayer disk and harm performance. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block

[PATCH V3 1/2] block/loop: set hw_sectors

2017-08-31 Thread Shaohua Li
From: Shaohua Li Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for underlayer disk and also cause bad fs block allocation in directio mode. Reviewed-by: Omar Sandoval Reviewed-by: Ming Lei Signed-off-by: Shaohua Li

[PATCH V3 0/2] block/loop: improve performance

2017-08-31 Thread Shaohua Li
From: Shaohua Li two small patches to improve performance for loop in directio mode. The goal is to increase IO size sending to underlayer disks. Thanks, Shaohua V2 -> V3: - Use GFP_NOIO pointed out by Ming - Rebase to latest for-next branch Shaohua Li (2): block/loop: set hw_sect

Re: [PATCH] block/loop: fix use after feee

2017-08-30 Thread Shaohua Li
On Wed, Aug 30, 2017 at 02:51:05PM -0700, Shaohua Li wrote: > lo_rw_aio->call_read_iter-> > 1 aops->direct_IO > 2 iov_iter_revert > lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could > be freed before 2, which accesses bvec. please ignore this

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-30 Thread Shaohua Li
On Wed, Aug 30, 2017 at 02:43:40PM +0800, Ming Lei wrote: > On Tue, Aug 29, 2017 at 09:43:20PM -0700, Shaohua Li wrote: > > On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote: > > > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote: > > > > On Tue, Au

[PATCH] block/loop: fix use after feee

2017-08-30 Thread Shaohua Li
lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. This conflicts with my direcio performance improvement patches, which I'll resend. Signed-off-

[PATCH] block/loop: fix use after free

2017-08-30 Thread Shaohua Li
lo_rw_aio->call_read_iter-> 1 aops->direct_IO 2 iov_iter_revert lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could be freed before 2, which accesses bvec. This conflicts with my direcio performance improvement patches, which I'll resend. Signed-off-

[PATCH 3/3] block/loop: suppress discard IO error message

2017-08-30 Thread Shaohua Li
h will suppress the IO error message Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index a30aa45..15f51e3 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -441,6 +441,9 @@ static

[PATCH 0/3]block/loop: handle discard/zeroout error

2017-08-30 Thread Shaohua Li
is correct behavior? Thanks, Shaohua Shaohua Li (3): block/loop: don't hijack error number block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES block/loop: suppress discard IO error message drivers/block/loop.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) -- 2.9.5

[PATCH 1/3] block/loop: don't hijack error number

2017-08-30 Thread Shaohua Li
If the bio returns -EOPNOTSUPP, we shouldn't hijack it and return -EIO Signed-off-by: Shaohua Li --- drivers/block/loop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index ef83349..054dccc 100644 --- a/drivers/block/l

[PATCH 2/3] block/loop: use FALLOC_FL_ZERO_RANGE for REQ_OP_WRITE_ZEROES

2017-08-30 Thread Shaohua Li
REQ_OP_WRITE_ZEROES really means zero the data. And in blkdev_fallocate, FALLOC_FL_ZERO_RANGE will retry but FALLOC_FL_PUNCH_HOLE will not, even if the loop request doesn't have BLKDEV_ZERO_NOFALLBACK set. Signed-off-by: Shaohua Li --- drivers/block/loop.c | 3 +++ 1 file changed, 3 insertions(+)

Re: [RFC] block/loop: make loop cgroup aware

2017-08-29 Thread Shaohua Li
On Tue, Aug 29, 2017 at 08:28:09AM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Tue, Aug 29, 2017 at 08:22:36AM -0700, Shaohua Li wrote: > > > Yeah, writeback tracks the most active cgroup and associates writeback > > > ios with that cgroup. For buffered loop de

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-29 Thread Shaohua Li
On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote: > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote: > > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote: > > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote: > &

Re: [RFC] block/loop: make loop cgroup aware

2017-08-29 Thread Shaohua Li
On Mon, Aug 28, 2017 at 03:54:59PM -0700, Tejun Heo wrote: > Hello, Shaohua. > > On Wed, Aug 23, 2017 at 11:15:15AM -0700, Shaohua Li wrote: > > loop block device handles IO in a separate thread. The actual IO > > dispatched isn't cloned from the IO loop device received

Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-29 Thread Shaohua Li
On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote: > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently loop disables merge. While it makes sense for buffer IO mode, > > directio mode can benefit from request merge

Re: [PATCH] block/nullb: delete unnecessary memory free

2017-08-28 Thread Shaohua Li
On Mon, Aug 28, 2017 at 02:55:59PM -0600, Jens Axboe wrote: > On 08/28/2017 02:49 PM, Shaohua Li wrote: > > Commit 2984c86(nullb: factor disk parameters) has a typo. The > > nullb_device allocation/free is done outside of null_add_dev. The commit > > accidentally frees the

[PATCH] block/nullb: delete unnecessary memory free

2017-08-28 Thread Shaohua Li
Commit 2984c86 (nullb: factor disk parameters) has a typo. The nullb_device allocation/free is done outside of null_add_dev. The commit accidentally frees the nullb_device in the error code path. Reported-by: Dan Carpenter Signed-off-by: Shaohua Li --- drivers/block/null_blk.c | 1 - 1 file changed

[PATCH] block/nullb: fix NULL deference

2017-08-25 Thread Shaohua Li
L here. Reported-by: Dan Carpenter Signed-off-by: Shaohua Li --- drivers/block/null_blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c index 2032360..4d328e3 100644 --- a/drivers/block/null_blk.c +++ b/drivers/block/null_blk

[PATCH V2 2/2] block/loop: allow request merge for directio mode

2017-08-24 Thread Shaohua Li
From: Shaohua Li Currently loop disables merge. While it makes sense for buffer IO mode, directio mode can benefit from request merge. Without merge, loop could send small size IO to the underlying disk and harm performance. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block

[PATCH V2 1/2] block/loop: set hw_sectors

2017-08-24 Thread Shaohua Li
From: Shaohua Li Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for the underlying disk and also causes bad fs block allocation in directio mode. Reviewed-by: Omar Sandoval Signed-off-by: Shaohua Li --- drivers/block/loop.c | 1

[PATCH V2 0/2] block/loop: improve performance

2017-08-24 Thread Shaohua Li
From: Shaohua Li Two small patches to improve performance for loop in directio mode. The goal is to increase the IO size sent to the underlying disks. As Omar pointed out, the patches have a slight conflict with his, but should be easy to fix. Thanks, Shaohua Shaohua Li (2): block/loop: set

Re: [PATCH 2/2] block/loop: allow request merge for directio mode

2017-08-24 Thread Shaohua Li
On Thu, Aug 24, 2017 at 10:57:39AM -0700, Omar Sandoval wrote: > On Wed, Aug 23, 2017 at 04:49:24PM -0700, Shaohua Li wrote: > > From: Shaohua Li > > > > Currently loop disables merge. While it makes sense for buffer IO mode, > > directio mode can benefit from request
