> On 15 Jan 2017, at 01:03, Josef Bacik wrote:
>
> Yeah, I noticed this in testing. There's a bug in NBD since its
> inception where it uses a 32-bit number to keep track of the size
> of the device; I fixed it with
>
> nbd: use loff_t for blocksize and nbd_set_size args
When the queue is in the LIMIT_LOW state and all cgroups with a low
limit cross their bps/iops limits, we upgrade the queue's state to
LIMIT_MAX. To determine whether a cgroup exceeds its limit, we check if
the cgroup has pending requests. Since the cgroup is throttled according
to the limit, pending request
Each queue has a state machine. Initially the queue is in the LIMIT_LOW
state, which means all cgroups are throttled according to their low
limits. After all cgroups with a low limit cross the limit, the queue
state is upgraded to LIMIT_MAX.
For max limit, cgroup will use the limit
Add a low limit for cgroups and the corresponding cgroup interface. To be
consistent with memcg, we allow users to configure a .low limit higher
than the .max limit. But the internal logic always assumes the .low limit
is lower than the .max limit. So we add extra bps/iops_conf fields in
throtl_grp for userspace
Hi,
cgroup still lacks a good IO controller. CFQ works well for hard disks,
but not so well for SSDs. This patch set tries to add a conservative
limit to blk-throttle. It isn't proportional scheduling, but it can help
prioritize cgroups. There are several reasons we chose blk-throttle:
- blk-throttle
Add an interface to configure the threshold. The io.low interface will
look like:
echo "8:16 rbps=2097152 wbps=max idle=2000" > io.low
idle is in microseconds.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 41 -
1 file changed, 28
A cgroup could be assigned a limit but not dispatch enough IO, e.g., the
cgroup is idle. When this happens, the cgroup doesn't hit its limit, so
we can't move the state machine to a higher level, and all cgroups will
be throttled to their low limits, wasting bandwidth. Detecting an idle
cgroup is
When all cgroups reach their low limits, they can dispatch more IO. This
could make some cgroups dispatch more IO while others don't, and some
cgroups could even dispatch less IO than their low limit. For example,
cg1 has a low limit of 10MB/s and cg2 of 80MB/s; assume the disk's
maximum bandwidth is 120MB/s for the
The throtl_slice is 100ms by default. This is a long time for an SSD; a
lot of IO can run in it. To give cgroups smoother throughput, we choose a
small value (20ms) for SSDs.
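Given that the patch touches blk-sysfs.c, the slice is presumably tunable from sysfs. The knob name below is an assumption (later revisions of this series used throttle_sample_time); a config fragment might look like:

```sh
# hypothetical knob name; check the patch for the actual sysfs attribute
cat /sys/block/sda/queue/throttle_sample_time    # current slice, ms
echo 20 > /sys/block/sda/queue/throttle_sample_time
```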
Signed-off-by: Shaohua Li
---
block/blk-sysfs.c| 2 ++
block/blk-throttle.c | 18 +++---
The last patch introduces a way to detect an idle cgroup. We use it to
make the upgrade/downgrade decision. The new algorithm can also detect a
completely idle cgroup, so we can delete the corresponding code.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 40
The user configures a latency target, but the latency threshold for each
request size isn't fixed. For an SSD, IO latency depends heavily on
request size. To calculate latency thresholds, we sample some data, e.g.,
the average latency for request sizes 4k, 8k, 16k, 32k .. 1M. The latency
threshold of each
On Sat, 2017-01-14 at 20:03 -0500, Josef Bacik wrote:
> nbd: use loff_t for blocksize and nbd_set_size args
I'm just trying your patch, which however does not apply to 4.9.2 (but
I've adapted it)... I'll report later whether it worked for me.
Cheers,
Chris.
On Sat, Jan 14, 2017 at 4:10 PM, Sagi Grimberg wrote:
Hey Josef,
To prepare for dynamically adding new nbd devices to the system switch
from using an array for the nbd devices and instead use an idr. This
copies what loop does for keeping track of its devices.
I think
On Sat, Jan 14, 2017 at 6:31 PM, Christoph Anton Mitterer
wrote:
Hi.
On advice from Alex Bligh I'd like to ping linux-block and nbd-general
about the issue described here:
https://github.com/NetworkBlockDevice/nbd/issues/44
What basically happens is that with a recent
Hi Linus,
Here's a set of fixes for the current series. This pull request
contains:
- The virtio_blk stack DMA corruption fix from Christoph, fixing an
issue with VMAP stacks.
- O_DIRECT blkbits calculation fix from Chandan.
- Discard regression fix from Christoph.
- Queue init error
Hi.
On advice from Alex Bligh I'd like to ping linux-block and nbd-general
about the issue described here:
https://github.com/NetworkBlockDevice/nbd/issues/44
What basically happens is that with a recent kernel (Linux heisenberg
4.9.0-1-amd64 #1 SMP Debian 4.9.2-2 (2017-01-12) x86_64
> On Jan 14, 2017, at 4:15 PM, Sagi Grimberg wrote:
>
>
>>> Hey Josef,
>>>
Since we are in the memory reclaim path we need our recv work to be on a
workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks. Also
set WQ_HIGHPRI since we are in the completion path for IO.
Hey Josef,
Since we are in the memory reclaim path we need our recv work to be on a
workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks. Also
set WQ_HIGHPRI since we are in the completion path for IO.
Really a workqueue per device?? Did this really give performance
advantage?
Hey Josef,
To prepare for dynamically adding new nbd devices to the system switch
from using an array for the nbd devices and instead use an idr. This
copies what loop does for keeping track of its devices.
I think ida_simple_* is simpler and sufficient here isn't it?
I use more of the