throtl_slice is important for blk-throttling: a lot of things depend on
it, for example throughput measurement. Its default value is 100ms,
which is not appropriate for all disks. For an SSD, for example, we
might use a smaller value to make the throughput smoother. This patch
makes it tunable.
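If the tunable is exposed as a sysfs queue attribute (the attribute name and
the millisecond unit below are assumptions for illustration, modeled on the
`throttle_sample_time` attribute that later kernels expose, not on this patch's
actual interface), usage might look like:

```shell
# Hypothetical sysfs knob for throtl_slice; name and units are assumptions.
# Read the current slice length:
cat /sys/block/sda/queue/throttle_sample_time

# Use a smaller slice on an SSD to make throughput measurement smoother:
echo 20 > /sys/block/sda/queue/throttle_sample_time
```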
A cgroup could be assigned a limit but not dispatch enough IO, e.g. when
the cgroup is idle. When this happens, the cgroup never hits its limit,
so we can't move the state machine to a higher level, and all cgroups
stay throttled to their lower limit, wasting bandwidth. Detecting an
idle cgroup is
Each queue will have a state machine. Initially the queue is in the
LIMIT_HIGH state, which means all cgroups are throttled according to
their high limit. After all cgroups with a high limit cross that limit,
the queue state gets upgraded to LIMIT_MAX.
Cgroups without a high limit will use the max limit.
A cgroup could be throttled to one limit, but when all cgroups cross
their high limit the queue enters a higher state, so the cgroup should
then be throttled to a higher limit. It's possible the cgroup is
sleeping because of throttling while the other cgroups no longer
dispatch IO. In this case, nobody can trigger
A cgroup gets assigned a high limit, but the cgroup might never dispatch
enough IO to cross it. In that case the queue state machine will remain
in the LIMIT_HIGH state and all other cgroups will be throttled
according to their high limits, which is unfair to them. We should
treat the
We are going to support high/max limits, so each cgroup will have two
limits after that. This patch prepares for the multiple-limits change.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 109 ---
1 file changed, 68 insertions(+), 41 deletions(-)
Add high limit for cgroup and corresponding cgroup interface.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 139 +++
1 file changed, 107 insertions(+), 32 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
Alex,
Christoph,
On Mon, Oct 03, 2016 at 12:34:33PM +0100, Alex Bligh wrote:
> On 3 Oct 2016, at 08:57, Christoph Hellwig wrote:
> >> Can you clarify what you mean by that? Why is it an "odd flush
> >> definition", and how would you "properly" define it?
> >
> > E.g. take
Hi,
While trying to do a discard (via blkdiscard --length 1048576
/dev/) to an LVM device atop a two-disk md RAID1, the
following oops was generated:
[ 103.306243] md: resync of RAID array md127
[ 103.306246] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 103.306248] md: using maximum
On 10/03/2016 07:34 AM, Alex Bligh wrote:
On 3 Oct 2016, at 08:57, Christoph Hellwig wrote:
Can you clarify what you mean by that? Why is it an "odd flush
definition", and how would you "properly" define it?
E.g. take the definition from NVMe which also supports
> On 3 Oct 2016, at 08:57, Christoph Hellwig wrote:
>
>> Can you clarify what you mean by that? Why is it an "odd flush
>> definition", and how would you "properly" define it?
>
> E.g. take the definition from NVMe which also supports multiple queues:
>
> "The Flush command
On Mon, Oct 03, 2016 at 09:51:49AM +0200, Wouter Verhelst wrote:
> Actually, I was pointing out the TCP head-of-line issue, where a delay
> on the socket that contains the flush reply would result in the arrival
> in the kernel block layer of a write reply before the said flush reply,
> resulting
On Mon, Oct 03, 2016 at 12:20:49AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 03, 2016 at 01:47:06AM +, Josef Bacik wrote:
> > It's not "broken", it's working as designed, and any fs on top of this
> > patch will be perfectly safe because they all wait for their io to complete
> > before
On Sun, Oct 02, 2016 at 05:17:14PM +0100, Alex Bligh wrote:
> On 29 Sep 2016, at 17:59, Josef Bacik wrote:
> > Huh I missed that. Yeah that's not possible for us for sure, I think my
> > option
> > idea is the less awful way forward if we want to address that limitation.
> >