On 28.06.2016 17:28, Sascha Silbe wrote:
> ratelimit_calculate_delay() previously reset the accounting every time
> slice, no matter how much data had been processed before. This had (at
> least) two consequences:
>
> 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream.
>
>    Not sure if there are real-world use cases where this would be a
>    problem. Mirroring and backup over a slow link (e.g. DSL) would
>    come to mind, though.
>
> 2. Tests for block job operations (e.g. cancel) were rather racy
>
>    All block jobs currently use a time slice of 100ms. That's a
>    reasonable value to get smooth output during regular
>    operation. However this also meant that the state of block jobs
>    changed every 100ms, no matter how low the configured limit was. On
>    busy hosts, qemu often transferred additional chunks until the test
>    case had a chance to cancel the job.
>
> Fix the block job rate limit code to delay for more than one time
> slice to address the above issues. To make it easier to handle
> oversized chunks we switch the semantics from returning a delay
> _before_ the current request to a delay _after_ the current
> request. If necessary, this delay consists of multiple time slice
> units.
>
> Since the mirror job sends multiple chunks in one go even if the rate
> limit was exceeded in between, we need to keep track of the start of
> the current time slice so we can correctly re-compute the delay for
> the updated amount of data.
>
> The minimum bandwidth now is 1 data unit per time slice. The block
> jobs are currently passing the amount of data transferred in sectors
> and using 100ms time slices, so this translates to 5120
> bytes/second. With chunk sizes usually being O(512KiB), tests have
> plenty of time (O(100s)) to operate on block jobs. The chance of a
> race condition now is fairly remote, except possibly on insanely
> loaded systems.
>
> Signed-off-by: Sascha Silbe <[email protected]>
> ---
>  block/commit.c           | 13 +++++--------
>  block/mirror.c           |  4 +++-
>  block/stream.c           | 12 ++++--------
>  include/qemu/ratelimit.h | 43 ++++++++++++++++++++++++++++++++++---------
>  4 files changed, 46 insertions(+), 26 deletions(-)
>
[...]
> diff --git a/block/mirror.c b/block/mirror.c
> index a04ed9c..d9d70f0 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -416,7 +416,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>          assert(io_sectors);
>          sector_num += io_sectors;
>          nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
> -        delay_ns += ratelimit_calculate_delay(&s->limit, io_sectors);
> +        if (s->common.speed) {
> +            delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors);
> +        }
Hmm... Was it a bug that ratelimit_calculate_delay() was called
unconditionally before?
>      }
>      return delay_ns;
>  }
[...]
> diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
> index d413a4a..12db769 100644
> --- a/include/qemu/ratelimit.h
> +++ b/include/qemu/ratelimit.h
> @@ -15,34 +15,59 @@
>  #define QEMU_RATELIMIT_H 1
>
>  typedef struct {
> -    int64_t next_slice_time;
> +    int64_t slice_start_time;
> +    int64_t slice_end_time;
>      uint64_t slice_quota;
>      uint64_t slice_ns;
>      uint64_t dispatched;
>  } RateLimit;
>
> +/** Calculate and return delay for next request in ns
> + *
> + * Record that we sent @p n data units. If we may send more data units
> + * in the current time slice, return 0 (i.e. no delay). Otherwise
> + * return the amount of time (in ns) until the start of the next time
> + * slice that will permit sending the next chunk of data.
> + *
> + * Recording sent data units even after exceeding the quota is
> + * permitted; the time slice will be extended accordingly.
> + */
>  static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
>  {
>      int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> +    uint64_t delay_slices;
>
> -    if (limit->next_slice_time < now) {
> -        limit->next_slice_time = now + limit->slice_ns;
> +    assert(limit->slice_quota && limit->slice_ns);
> +
> +    if (limit->slice_end_time < now) {
> +        /* Previous, possibly extended, time slice finished; reset the
> +         * accounting. */
> +        limit->slice_start_time = now;
> +        limit->slice_end_time = now + limit->slice_ns;
>          limit->dispatched = 0;
>      }
> -    if (limit->dispatched == 0 || limit->dispatched + n <= limit->slice_quota) {
> -        limit->dispatched += n;
> +
> +    limit->dispatched += n;
> +    if (limit->dispatched < limit->slice_quota) {
Nitpick: This should probably stay <=.
> +        /* We may send further data within the current time slice, no
> +         * need to delay the next request. */
>          return 0;
> -    } else {
> -        limit->dispatched = n;
> -        return limit->next_slice_time - now;
>      }
> +
> +    /* Quota exceeded. Calculate the next time slice we may start
> +     * sending data again. */
> +    delay_slices = (limit->dispatched + limit->slice_quota - 1) /
> +        limit->slice_quota;
> +    limit->slice_end_time = limit->slice_start_time +
> +        delay_slices * limit->slice_ns;
I think it would make sense to make this a floating point calculation.
If you don't agree, though:
Reviewed-by: Max Reitz <[email protected]>
> +    return limit->slice_end_time - now;
>  }
>
>  static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed,
>                                         uint64_t slice_ns)
>  {
>      limit->slice_ns = slice_ns;
> -    limit->slice_quota = ((double)speed * slice_ns)/1000000000ULL;
> +    limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1);
>  }
>
>  #endif
>