nvme_queue is a per-cpu queue (mostly). Allocate it in the node where
blk-mq will use it.
Signed-off-by: Shaohua Li <s...@fb.com>
---
drivers/nvme/host/pci.c | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host
blk_mq_tags/requests of a specific hardware queue are mostly used on
specific cpus, which might not be in the same numa node as the disk. For
example, an nvme card is in node 0. Half of the hardware queues will be
used by node 0, the other half by node 1.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-mq.
On Fri, Feb 17, 2017 at 09:25:27AM +0800, Ming Lei wrote:
> Hi Shaohua,
>
> On Fri, Feb 17, 2017 at 6:16 AM, Shaohua Li <s...@kernel.org> wrote:
> > On Thu, Feb 16, 2017 at 07:45:30PM +0800, Ming Lei wrote:
> >> In MD's resync I/O path, there are lots of direct a
On Tue, Feb 14, 2017 at 08:01:09AM -0800, Christoph Hellwig wrote:
> On Tue, Feb 14, 2017 at 11:29:03PM +0800, Ming Lei wrote:
> > Firstly bio_clone_mddev() is used in raid normal I/O and isn't
> > in resync I/O path.
> >
> > Secondly all the direct access to bvec table in raid happens on
> >
On Tue, Feb 14, 2017 at 11:28:58PM +0800, Ming Lei wrote:
> Hi,
>
> This patch replaces bio_clone() with bio_fast_clone() in
> bio_clone_mddev() because:
>
> 1) bio_clone_mddev() is used in raid normal I/O and isn't in
> resync I/O path, and all the direct access to bvec table in
> raid
On Wed, Feb 15, 2017 at 11:20:25AM -0800, Shaohua Li wrote:
> On Tue, Feb 14, 2017 at 08:01:09AM -0800, Christoph Hellwig wrote:
> > On Tue, Feb 14, 2017 at 11:29:03PM +0800, Ming Lei wrote:
> > > Firstly bio_clone_mddev() is used in raid normal I/O and isn't
> >
On Tue, Feb 28, 2017 at 11:41:35PM +0800, Ming Lei wrote:
> This patch gets each page's reference of each bio for resync,
> then r1buf_pool_free() gets simplified a lot.
>
> The same policy has been taken in raid10's buf pool allocation/free
> too.
We are going to delete the code, this simplify
eset any more.
>
> This patch can be thought of as a cleanup too
>
> Suggested-by: Shaohua Li <s...@kernel.org>
> Signed-off-by: Ming Lei <tom.leim...@gmail.com>
> ---
> drivers/md/raid1.c | 83
> ++
> 1 file
hly new now in these functions
> and not necessary to reset any more.
>
> This patch can be thought of as a cleanup too.
>
> Suggested-by: Shaohua Li <s...@kernel.org>
> Signed-off-by: Ming Lei <tom.leim...@gmail.com>
> ---
> drivers/md/raid10.c | 125
> +++
On Tue, Feb 28, 2017 at 11:41:38PM +0800, Ming Lei wrote:
> Avoid direct access to bvec table.
>
> Signed-off-by: Ming Lei
> ---
> drivers/md/raid1.c | 12
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid1.c
On Tue, Feb 28, 2017 at 11:41:34PM +0800, Ming Lei wrote:
> Now resync I/O uses bio's bvec table to manage pages,
> this way is very hacky, and may not work any more
> once multipage bvec is introduced.
>
> So introduce helpers and new data structure for
> managing resync I/O pages more cleanly.
>
On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote:
> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
> > Hi,
> >
> > while trying Kernel 4.8-rc5 my raid5 breaks every few minutes.
> >
> > Trace:
> > [ cut here ]
> > kernel BUG at
On Fri, Sep 09, 2016 at 08:03:42PM +0200, Stefan Priebe - Profihost AG wrote:
> On 08.09.2016 at 19:33, Shaohua Li wrote:
> > On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote:
> >> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote:
> >>> Hi,
>
.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-sysfs.c| 11
block/blk-throttle.c | 72
block/blk.h | 3 +++
3 files changed, 64 insertions(+), 22 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
not too big wouldn't change cgroup
bps/iops, but could make it wake up more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/blk-thro
Add high limit for cgroup and corresponding cgroup interface.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 139 +++
1 file changed, 107 insertions(+), 32 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-thro
is hard. This patch handles a simple case: a cgroup doesn't
dispatch any IO. We ignore such a cgroup's limit, so other cgroups can use
the bandwidth.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)
diff
s
something we pay for sharing.
Note this doesn't completely avoid a cgroup running under its high limit.
The best way to guarantee a cgroup doesn't run under its limit is to set a
max limit. For example, if we set cg1 max limit to 40, cg2 will never
run under its high limit.
Signed-off-by: Shaohua L
9=2
--
Shaohua Li (11):
block-throttle: prepare support multiple limits
block-throttle: add .high interface
block-throttle: configure bps/iops limit for cgroup in high limit
block-throttle: add upgrade logic for LIMIT_HIGH state
block-throttle:
(by
default 50us). 50us is chosen arbitrarily so far, but seems ok in
test and should allow the cpu to do a lot of things before dispatching IO.
There is a knob to let user configure the threshold too.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/bio.c | 2 ++
block/blk-s
limit, we can upgrade queue state. The other case is that
children have a higher high limit than the parent. The children's high
limit is then meaningless: as long as the parent's bps/iops cross its high
limit, we can upgrade the queue state.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c
The last patch introduces a way to detect idle cgroups. We use it to make
upgrade/downgrade decisions.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 30 ++
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/block/blk-throttle.c b/blo
On Wed, Oct 05, 2016 at 10:49:46AM -0400, Tejun Heo wrote:
> Hello, Paolo.
>
> On Wed, Oct 05, 2016 at 02:37:00PM +0200, Paolo Valente wrote:
> > In this respect, for your generic, unpredictable scenario to make
> > sense, there must exist at least one real system that meets the
> > requirements
On Wed, Oct 05, 2016 at 11:30:53AM -0700, Shaohua Li wrote:
> On Wed, Oct 05, 2016 at 10:49:46AM -0400, Tejun Heo wrote:
> > Hello, Paolo.
> >
> > On Wed, Oct 05, 2016 at 02:37:00PM +0200, Paolo Valente wrote:
> > > In this respect, for your generic, unpredictabl
On Wed, Oct 05, 2016 at 09:57:22PM +0200, Paolo Valente wrote:
>
> > On 05 Oct 2016, at 21:08, Shaohua Li <s...@fb.com> wrote:
> >
> > On Wed, Oct 05, 2016 at 11:30:53AM -0700, Shaohua Li wrote:
> >> On Wed, Oct 05, 2016 at 10:49:46AM -0400,
On Wed, Oct 05, 2016 at 09:47:19PM +0200, Paolo Valente wrote:
>
> > On 05 Oct 2016, at 20:30, Shaohua Li <s...@fb.com> wrote:
> >
> > On Wed, Oct 05, 2016 at 10:49:46AM -0400, Tejun Heo wrote:
> >> Hello, Paolo.
> >>
> >
.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-sysfs.c| 11
block/blk-throttle.c | 72
block/blk.h | 3 +++
3 files changed, 64 insertions(+), 22 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
is hard. This patch handles a simple case: a cgroup doesn't
dispatch any IO. We ignore such a cgroup's limit, so other cgroups can use
the bandwidth.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)
diff
for their high limit.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 21 +++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 59d4b4c..e2b3704 100644
--- a/block/blk-throttle.c
+++ b/blo
We are going to support high/max limits, so each cgroup will have 2 limits
after that. This patch prepares for the multiple limits change.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 109 ---
1 file changed, 68 insertions(
Add high limit for cgroup and corresponding cgroup interface.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 139 +++
1 file changed, 107 insertions(+), 32 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-thro
not too big wouldn't change cgroup
bps/iops, but could make it wake up more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/blk-thro
(by
default 50us for SSD and 1ms for HD). The idea is that think time above the
threshold will start to harm performance. HD is much slower, so a longer
think time is ok. There is a knob to let the user configure the threshold
too.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/bio.c
On Tue, Oct 18, 2016 at 04:10:24PM +0200, Tomasz Majchrzak wrote:
> Once external metadata handler acknowledges all bad blocks (by writing
> to rdev 'bad_blocks' sysfs file), it requests to unblock the array.
> Check if all bad blocks are actually acknowledged as there might be a
> race if new bad
When badblocks_set acknowledges a range or badblocks_clear clears a range,
it's possible that all badblocks are now acknowledged. We should update
unacked_exist if this occurs.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/badblocks.c | 23 +++
1 file changed, 23 insertions(+)
On Sat, Nov 12, 2016 at 09:42:38AM -0800, Christoph Hellwig wrote:
> On Fri, Nov 11, 2016 at 11:02:23AM -0800, Shaohua Li wrote:
> > > It's mostly about the RAID1 and RAID10 code which does a lot of funny
> > > things with the bi_io_vec and bi_vcnt fields, which we'd pref
On Mon, Nov 14, 2016 at 04:41:33PM -0800, Bart Van Assche wrote:
> On 11/14/2016 04:05 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 02:46:22PM -0800, Bart Van Assche wrote:
> > > On 11/14/2016 02:22 PM, Shaohua Li wrote:
> > > > The background is we don't have
On Mon, Nov 14, 2016 at 02:46:22PM -0800, Bart Van Assche wrote:
> On 11/14/2016 02:22 PM, Shaohua Li wrote:
> > The background is we don't have an ioscheduler for blk-mq yet, so we can't
> > prioritize processes/cgroups. This patch set tries to add basic arbitration
> > bet
On Mon, Nov 14, 2016 at 05:18:28PM -0800, Bart Van Assche wrote:
> On 11/14/2016 04:49 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 04:41:33PM -0800, Bart Van Assche wrote:
> > > Thank you for pointing me to the discussion thread about v3 of this patch
> > >
On Thu, Nov 10, 2016 at 11:46:36AM -0800, Christoph Hellwig wrote:
> Hi Shaohua,
>
> one of the major issues with Ming Lei's multipage biovec work
> is that we can't easily enable the MD RAID code for it. I had
> a quick chat on that with Chris and Jens and they suggested talking
> to you
On Wed, Nov 23, 2016 at 04:23:35PM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 14, 2016 at 02:22:16PM -0800, Shaohua Li wrote:
> > cg1/cg2 bps: 10/80 -> 15/105 -> 20/100 -> 25/95 -> 30/90 -> 35/85 -> 40/80
> > -> 45/75 -> 10/80
>
> I wonde
On Tue, Nov 22, 2016 at 04:27:15PM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 14, 2016 at 02:22:14PM -0800, Shaohua Li wrote:
> > throtl_slice is important for blk-throttling. A lot of stuff depends on
> > it, for example, throughput measurement. It has 100ms def
On Mon, Nov 28, 2016 at 05:21:48PM -0500, Tejun Heo wrote:
> Hello, Shaohua.
>
> On Wed, Nov 23, 2016 at 05:15:18PM -0800, Shaohua Li wrote:
> > > Hmm... I'm not sure thinktime is the best measure here. Think time is
> > > used by cfq mainly to tell the likely futur
On Mon, Nov 28, 2016 at 05:08:18PM -0500, Tejun Heo wrote:
> Hello, Shaohua.
>
> On Wed, Nov 23, 2016 at 05:06:30PM -0800, Shaohua Li wrote:
> > > Shouldn't this be a per-cgroup setting along with latency target?
> > > These two are the parameters which d
On Tue, Nov 22, 2016 at 03:16:43PM -0500, Tejun Heo wrote:
> On Mon, Nov 14, 2016 at 02:22:10PM -0800, Shaohua Li wrote:
> > each queue will have a state machine. Initially queue is in LIMIT_HIGH
> > state, which means all cgroups will be throttled according to their high
>
On Tue, Nov 22, 2016 at 04:42:00PM -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Nov 22, 2016 at 04:21:21PM -0500, Tejun Heo wrote:
> > 1. A cgroup and its high and max limits don't have much to do with
> >other cgroups and their limits. I don't get how the choice between
> >high and max
On Tue, Nov 15, 2016 at 11:53:39AM -0800, Bart Van Assche wrote:
> On 11/14/2016 05:28 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 05:18:28PM -0800, Bart Van Assche wrote:
> > > Unless someone can convince me of the opposite I think that coming up with
> > > an a
not too big wouldn't change cgroup
bps/iops, but could make it wake up more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 4
1 file changed, 4 insertions(+)
diff --git a/blo
Add high limit for cgroup and corresponding cgroup interface.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 132 ---
1 file changed, 103 insertions(+), 29 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-thro
for their high limit.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 21 +++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a564215..ec53671 100644
--- a/block/blk-throttle.c
+++ b/blo
-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 191 +-
include/linux/blk_types.h | 2 +
2 files changed, 190 insertions(+), 3 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 01b494d..a05d351 100644
--- a
for SSD as we can't
calculate the latency target for hard disk. And this is only for cgroup
leaf node so far.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 58 ---
include/linux/blk_types.h | 1 +
2 files changed, 56 inse
.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-sysfs.c| 11
block/blk-throttle.c | 77 +---
block/blk.h | 3 ++
3 files changed, 69 insertions(+), 22 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
ents
http://marc.info/?l=linux-block&m=147395674732335&w=2
V1:
http://marc.info/?l=linux-block&m=146292596425689&w=2
Shaohua Li (15):
blk-throttle: prepare support multiple limits
blk-throttle: add .high interface
blk-throttle: configure bps/iops limit for cgroup in high limit
blk-throttle: ad
Add interface to configure the threshold
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-sysfs.c| 7 +++
block/blk-throttle.c | 25 +
block/blk.h | 4
3 files changed, 36 insertions(+)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
We are going to support high/max limits, so each cgroup will have 2 limits
after that. This patch prepares for the multiple limits change.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 110 ---
1 file changed, 69 insertions(
(by
default 50us for SSD and 1ms for HD). The idea is that think time above the
threshold will start to harm performance. HD is much slower, so a longer
think time is ok.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/bio.c | 2 ++
block/blk-throttle.c
s
something we pay for sharing.
Note this doesn't completely avoid a cgroup running under its high limit.
The best way to guarantee a cgroup doesn't run under its limit is to set a
max limit. For example, if we set cg1 max limit to 40, cg2 will never
run under its high limit.
Signed-off-by: Shaohua L
When the queue state machine is in LIMIT_MAX state but a cgroup has been
below its high limit for some time, the queue should be downgraded to a
lower state, as one cgroup's high limit isn't met.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c
The last patch introduces a way to detect idle cgroups. We use it to make
upgrade/downgrade decisions. The new algorithm can detect completely
idle cgroups too, so we can delete the corresponding code.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.
Add interface for per-cgroup target latency. This latency is for 4k
request.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 67
1 file changed, 67 insertions(+)
diff --git a/block/blk-throttle.c b/block/blk-thro
limit, we can upgrade queue state. The other case is that
children have a higher high limit than the parent. The children's high
limit is then meaningless: as long as the parent's bps/iops cross its high
limit, we can upgrade the queue state.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c
.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-core.c | 4 +++-
include/linux/blkdev.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 14d7c07..0a396e9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1763,7 +
This is the corresponding part for blk-mq. Disks with multiple hardware
queues don't need this as we only hold 1 request at most.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-mq.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/bl
: check the last request instead of the first request, so as long as
there is one big size request we flush the plug.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-core.c | 4 +++-
include/linux/blkdev.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/blo
On Thu, Nov 03, 2016 at 05:09:54PM -0700, Christoph Hellwig wrote:
> On Thu, Nov 03, 2016 at 05:03:54PM -0700, Shaohua Li wrote:
> > This is the corresponding part for blk-mq. Disks with multiple hardware
> > queues don't need this as we only hold 1 request at most.
>
> A
On Tue, Nov 29, 2016 at 05:54:46PM -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Nov 29, 2016 at 10:14:03AM -0800, Shaohua Li wrote:
> > What the patches do doesn't conflict with what you are talking about. We need a
> > way
> > to detect if cgroups are idle or active.
On Wed, Dec 07, 2016 at 07:50:33PM +0800, Coly Li wrote:
> On 2016/11/30 6:45 AM, Avi Kivity wrote:
> > On 11/29/2016 11:14 PM, NeilBrown wrote:
> [snip]
>
> >>> So I disagree that all the work should be pushed to the merging layer.
> >>> It has less information to work with, so the fewer
On Fri, Dec 09, 2016 at 12:44:57AM +0800, Coly Li wrote:
> On 2016/12/8 12:59 AM, Shaohua Li wrote:
> > On Wed, Dec 07, 2016 at 07:50:33PM +0800, Coly Li wrote:
> [snip]
> > Thanks for doing this, Coly! For raid0, this totally makes sense. The raid0
> > zones make things a l
For sync direct IO, generic_file_direct_write/generic_file_read_iter
will update the file access position. Don't duplicate the update in
.direct_IO. This causes my raid array to fail to assemble.
Cc: Christoph Hellwig <h...@lst.de>
Cc: Jens Axboe <ax...@fb.com>
Signed-off-by: Shaohua Li
When the queue state machine is in LIMIT_MAX state but a cgroup has been
below its low limit for some time, the queue should be downgraded to a
lower state, as one cgroup's low limit isn't met.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c
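A minimal Python sketch of the downgrade rule described in the message above, for illustration only. The names (`next_state`, `below_low_since`, `window`) are hypothetical, and the length of "some time" is not stated in the text, so it is a parameter here. This is a userspace model, not the kernel code.

```python
LIMIT_LOW = "LIMIT_LOW"
LIMIT_MAX = "LIMIT_MAX"


def next_state(state, groups, now, window):
    """Return the queue state after checking every cgroup once.

    groups: list of dicts; 'below_low_since' is the time at which the
            cgroup dropped below its low limit, or None if it is meeting it.
    window: how long a cgroup must stay below its low limit before the
            queue is downgraded (assumed parameter).
    """
    if state != LIMIT_MAX:
        # Only the LIMIT_MAX -> LIMIT_LOW transition is modelled here.
        return state
    for group in groups:
        since = group["below_low_since"]
        if since is not None and now - since >= window:
            # One cgroup's low limit isn't met: downgrade the whole queue.
            return LIMIT_LOW
    return LIMIT_MAX
```

One unmet low limit is enough to downgrade, which matches the "as one cgroup's low limit isn't met" wording.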
We are going to support low/max limits, so each cgroup will have 2 limits
after that. This patch prepares for the multiple limits change.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 114 +--
1 file changed, 73 insertions(
is hard. This patch handles a simple case: a cgroup doesn't
dispatch any IO. We ignore such a cgroup's limit, so other cgroups can use
the bandwidth.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff
'throttle_sample_time' reflects its character better.
Signed-off-by: Shaohua Li <s...@fb.com>
---
Documentation/block/queue-sysfs.txt | 6 +++
block/blk-sysfs.c | 10 +
block/blk-throttle.c| 77 ++---
block/blk.h
50us for SSD and 1ms for HD). The idea is that think time above the
threshold will start to harm performance. HD is much slower, so a longer
think time is ok.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/bio.c | 2 ++
block/blk-throttle.c
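The think-time check described in the message above can be sketched in Python as follows. The 50us and 1ms defaults come from the text; the function names and the 7/8 moving average used to smooth samples are assumptions made for this example, not something the message states.

```python
SSD_THRESHOLD_US = 50     # default for SSD, from the message
HD_THRESHOLD_US = 1000    # default for HD (1ms), from the message


def update_avg_think_time(avg_us, sample_us):
    """Fold one think-time sample into a running average.

    The 7/8 weighting is an assumption made for this sketch.
    """
    return (avg_us * 7 + sample_us) // 8


def is_idle(avg_think_us, rotational):
    """A cgroup counts as idle when its average think time exceeds the
    threshold for the device type."""
    threshold = HD_THRESHOLD_US if rotational else SSD_THRESHOLD_US
    return avg_think_us > threshold
```

With these defaults, an average think time of 200us counts as idle on an SSD but not on a hard disk.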
clean up the code to avoid using -1
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 32
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a6bb4fe..e45bf50 100644
--- a/blo
Add low limit for cgroup and corresponding cgroup interface.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 134 ---
1 file changed, 106 insertions(+), 28 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-thro
The last patch introduces a way to detect idle cgroups. We use it to make
upgrade/downgrade decisions. The new algorithm can detect completely
idle cgroups too, so we can delete the corresponding code.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.
Currently there is no way to know the request size when the request is
finished. The next patch will need this info, so add it to blk_issue_stat.
With this, we will have 49 bits to track time, which is still a very long time.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-core.c
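A rough Python sketch of packing a size and a timestamp into one 64-bit word, as the message above describes. Only the 49-bit time field comes from the text. The layout of the remaining 15 bits, the nanosecond unit, and all names here are assumptions for illustration.

```python
TIME_BITS = 49                      # from the message
SIZE_BITS = 64 - TIME_BITS          # assumption: all remaining bits hold the size
TIME_MASK = (1 << TIME_BITS) - 1


def pack(size, time_ns):
    """Store size in the high bits and the (wrapped) time in the low 49 bits."""
    if not 0 <= size < (1 << SIZE_BITS):
        raise ValueError("size does not fit in the size field")
    return (size << TIME_BITS) | (time_ns & TIME_MASK)


def unpack(word):
    """Return (size, time_ns) from a packed word."""
    return word >> TIME_BITS, word & TIME_MASK


def time_range_days():
    """How long 49 bits of nanoseconds last before wrapping."""
    return (1 << TIME_BITS) / 1e9 / 86400
```

If the time is counted in nanoseconds, 2^49 ns is about 6.5 days before the field wraps, which is consistent with "still a very long time" for a single request.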
not too big wouldn't change cgroup
bps/iops, but could make it wake up more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 4
1 file changed, 4 insertions(+)
diff --git a/blo
this feature is SSD only, we probably
can use a fixed threshold like 4ms for hard disk though.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-stat.c | 4 ++
block/blk-throttle.c | 159 +-
block/blk.h | 2 +
include
the interface in this way:
echo "8:16 rbps=2097152 wbps=max latency=100 idle=200" > io.low
latency is in microseconds
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 30 ++
1 file changed, 26 insertions(+), 4 deletions(-)
dif
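For illustration, an io.low line like the one quoted above can be split into its fields with a few lines of Python. This is a hypothetical userspace helper (`parse_io_low` is not a kernel or library function); it only shows the "MAJ:MIN key=value ..." shape of the line, with "max" taken to mean "no limit".

```python
def parse_io_low(line):
    """Split an io.low line into ((major, minor), settings).

    Values are integers, except "max", which is returned as None to
    mean "unlimited" (a representation chosen for this sketch).
    """
    device, *pairs = line.split()
    major, minor = (int(part) for part in device.split(":"))
    settings = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        settings[key] = None if value == "max" else int(value)
    return (major, minor), settings
```

Calling it on the example line gives device (8, 16) with rbps=2097152, wbps unlimited, latency=100 and idle=200.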
configured by user.
For low limit, cgroup will use the minimum of the low limit and the max
limit configured by the user. The last patch already did the conversion.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/block/blk-thrott
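The rule in the message above (the low limit in effect is the smaller of the configured low and max limits) is small enough to state directly. The sketch below is Python with hypothetical names; using U64_MAX for "no max limit configured" is an assumption for this example.

```python
U64_MAX = 2**64 - 1   # stands for "no limit configured" in this sketch


def effective_low(low, max_limit=U64_MAX):
    """The low limit actually used is capped by the configured max limit."""
    return min(low, max_limit)
```

So a cgroup configured with low=100 and max=40 is treated as having a low limit of 40.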
Add interface to configure the threshold. The io.low interface will
look like:
echo "8:16 rbps=2097152 wbps=max idle=2000" > io.low
idle is in microseconds.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 27 ---
1 file changed,
idle and
other cgroups can dispatch more IO.
Currently this latency target check is only for SSD as we can't
calculate the latency target for hard disk. And this is only for cgroup
leaf node so far.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.
On Mon, Jan 09, 2017 at 04:46:35PM -0500, Tejun Heo wrote:
> Hello,
>
> Sorry about the long delay. Generally looks good to me. Overall,
> there are only a few things that I think should be addressed.
Thanks for your time!
> * Low limit should default to zero.
I forgot to change it after
On Tue, Nov 29, 2016 at 12:31:08PM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 14, 2016 at 02:22:22PM -0800, Shaohua Li wrote:
> > One hard problem adding .high limit is to detect idle cgroup. If one
> > cgroup doesn't dispatch enough IO against its high limit, we must
On Tue, Nov 29, 2016 at 12:24:35PM -0500, Tejun Heo wrote:
> Hello, Shaohua.
>
> On Mon, Nov 14, 2016 at 02:22:20PM -0800, Shaohua Li wrote:
> > To do this, we sample some data, eg, average latency for request size
> > 4k, 8k, 16k, 32k, 64k. We then use an equation f(
The throtl_slice is 100ms by default. This is a long time for an SSD,
during which a lot of IO can run. To give cgroups smoother throughput, we
choose a smaller value (20ms) for SSD.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-sysfs.c| 2 ++
block/blk-throttle.c | 18 +++---
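To see why a shorter slice smooths throughput, here is a small Python calculation of how much IO fits into one slice at a given bps limit. The 100ms and 20ms values come from the message above; the helper name and the simple "limit times slice length" budget model are assumptions for illustration.

```python
DEFAULT_SLICE_MS = 100   # default throtl_slice, from the message
SSD_SLICE_MS = 20        # value chosen for SSD, from the message


def bytes_per_slice(bps_limit, slice_ms):
    """Bytes a cgroup may dispatch within one slice at the given limit."""
    return bps_limit * slice_ms // 1000
```

At a 10 MiB/s limit (10485760 bytes/s), a 100ms slice allows a burst of 1048576 bytes, while a 20ms slice allows 209715 bytes, so dispatch is spread more evenly over time.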
The last patch introduces a way to detect idle cgroups. We use it to make
upgrade/downgrade decisions. The new algorithm can detect completely
idle cgroups too, so we can delete the corresponding code.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.
this feature is SSD only, we probably
can use a fixed threshold like 4ms for hard disk though.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-stat.c | 4 ++
block/blk-throttle.c | 162 --
block/blk.h | 2 +
include
ttp://marc.info/?l=linux-block&m=147395674732335&w=2
V1:
http://marc.info/?l=linux-block&m=146292596425689&w=2
Shaohua Li (18):
blk-throttle: use U64_MAX/UINT_MAX to replace -1
blk-throttle: prepare support multiple limits
blk-throttle: add .low interface
blk-throttle: configure bps/iops limit fo
Add interface to configure the threshold. The io.low interface will
look like:
echo "8:16 rbps=2097152 wbps=max idle=2000" > io.low
idle is in microseconds.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 41 -
.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 2d05c91..b3ce176 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -149,6
as parent's bps/iops (which is the sum of its children's
bps/iops) cross low limit, we can upgrade queue state.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 100 ---
1 file changed, 96 insertions(+), 4 deletions(-)
diff --git a/blo
-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 20 ++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d3ad43c..3bc6deb 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -212,12 +212,28 @@
configuration. Old bps/iops fields in throtl_grp will be the
actual limit we use for throttling.
Signed-off-by: Shaohua Li <s...@fb.com>
---
block/blk-throttle.c | 142 +--
1 file changed, 114 insertions(+), 28 deletions(-)
diff --git a/block/blk-thro
then
fully downgrade the queue to LIMIT_LOW state.
Note this doesn't completely avoid a cgroup running under its low limit.
The best way to guarantee a cgroup doesn't run under its limit is to set a
max limit. For example, if we set cg1 max limit to 40, cg2 will never
run under its low limit.
Signed-o
Jens,
can you look at this patch? If it's ok, I'd like to route it through the md tree.
Thanks,
Shaohua
On Fri, Mar 17, 2017 at 12:12:29AM +0800, Ming Lei wrote:
> Turns out we can use bio_copy_data in raid1's write behind,
> and we can make alloc_behind_pages() more clean/efficient,
> but we need
On Fri, Mar 24, 2017 at 04:57:37PM +1100, Neil Brown wrote:
> On Fri, Mar 17 2017, Ming Lei wrote:
>
> > Both raid1 and raid10 share common resync
> > block size and page count, so move them into md.h.
>
> I don't think this is necessary.
> These are just "magic" numbers. They don't have any