On 12/9/20 2:49 AM, Luc Michel wrote: > On 12/9/20 12:39 AM, Richard Henderson wrote: >> On 12/8/20 12:15 PM, Peter Maydell wrote: >>> The clock_get_ns() API claims to return the period of a clock in >>> nanoseconds. Unfortunately since it returns an integer and a >>> clock's period is represented in units of 2^-32 nanoseconds, >>> the result is often an approximation, and calculating a clock >>> expiry deadline by multiplying clock_get_ns() by a number-of-ticks >>> is unacceptably inaccurate. >>> >>> Introduce a new API clock_ticks_to_ns() which returns the number >>> of nanoseconds it takes the clock to make a given number of ticks. >>> This function can do the complete calculation internally and >>> will thus give a more accurate result. >>> >>> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org> >>> --- >>> The 64x64->128 multiply is a bit painful for 32-bit and I >>> guess in theory since we know we only want bits [95:32] >>> of the result we could special-case it, but TBH I don't >>> think 32-bit hosts merit much optimization effort these days. >>> --- >>> docs/devel/clocks.rst | 15 +++++++++++++++ >>> include/hw/clock.h | 29 +++++++++++++++++++++++++++++ >>> 2 files changed, 44 insertions(+) >>> >>> diff --git a/docs/devel/clocks.rst b/docs/devel/clocks.rst >>> index e5da28e2111..aebeedbb95e 100644 >>> --- a/docs/devel/clocks.rst >>> +++ b/docs/devel/clocks.rst >>> @@ -258,6 +258,21 @@ Here is an example: >>> clock_get_ns(dev->my_clk_input)); >>> } >>> +Calculating expiry deadlines >>> +---------------------------- >>> + >>> +A commonly required operation for a clock is to calculate how long >>> +it will take for the clock to tick N times; this can then be used >>> +to set a timer expiry deadline. Use the function ``clock_ticks_to_ns()``, >>> +which takes an unsigned 64-bit count of ticks and returns the length >>> +of time in nanoseconds required for the clock to tick that many times. >>> + >>> +It is important not to try to calculate expiry deadlines using a >>> +shortcut like multiplying a "period of clock in nanoseconds" value >>> +by the tick count, because clocks can have periods which are not a >>> +whole number of nanoseconds, and the accumulated error in the >>> +multiplication can be significant. >>> + >>> Changing a clock period >>> ----------------------- >>> diff --git a/include/hw/clock.h b/include/hw/clock.h >>> index 81bcf3e505a..a9425d9fb14 100644 >>> --- a/include/hw/clock.h >>> +++ b/include/hw/clock.h >>> @@ -16,6 +16,7 @@ >>> #include "qom/object.h" >>> #include "qemu/queue.h" >>> +#include "qemu/host-utils.h" >>> #define TYPE_CLOCK "clock" >>> OBJECT_DECLARE_SIMPLE_TYPE(Clock, CLOCK) >>> @@ -218,6 +219,34 @@ static inline unsigned clock_get_ns(Clock *clk) >>> return CLOCK_PERIOD_TO_NS(clock_get(clk)); >>> } >>> +/** >>> + * clock_ticks_to_ns: >>> + * @clk: the clock to query >>> + * @ticks: number of ticks >>> + * >>> + * Returns the length of time in nanoseconds for this clock >>> + * to tick @ticks times. Because a clock can have a period >>> + * which is not a whole number of nanoseconds, it is important >>> + * to use this function when calculating things like timer >>> + * expiry deadlines, rather than attempting to obtain a "period >>> + * in nanoseconds" value and then multiplying that by a number >>> + * of ticks. >>> + */ >>> +static inline uint64_t clock_ticks_to_ns(const Clock *clk, uint64_t ticks) >>> +{ >>> + uint64_t ns_low, ns_high; >>> + >>> + /* >>> + * clk->period is the period in units of 2^-32 ns, so >>> + * (clk->period * ticks) is the required length of time in those >>> + * units, and we can convert to nanoseconds by multiplying by >>> + * 2^32, which is the same as shifting the 128-bit multiplication >>> + * result right by 32. >>> + */ >>> + mulu64(&ns_low, &ns_high, clk->period, ticks); >>> + return ns_low >> 32 | ns_high << 32; >> >> With the shift, you're discarding the high 32 bits of the result. You'll >> lose >> those same bits if you shift one of the inputs left by 32, and use only the >> high part of the result, e.g. >> >> mulu(&discard, &ret, clk->period, ticks << 32); >> return ret; >> >> Which on some hosts, e.g. aarch64, only requires umulh and not two multiply >> instructions. >> >> Either way, I wonder if you want to either use uint32_t ticks, or assert that >> ticks <= UINT32_MAX? Or if you don't shift ticks, assert that ns_high <= >> UINT32_MAX, so that you don't lose output bits? > > If I'm not mistaken, loosing bits in the 32 bits upper part would mean that > the > number of ticks correspond to a period greater or equal to: > 2^96 ns ~= 251230855258 years.
No, would be the bit above above ns_high (this integer 64x64->128 multiplication is computing fixed point 64.0 x 32.32 -> 96.32). This function is truncating back to 64.0, dropping the 32 high bits and 32 low bits. We lose bits at 2^64 ns ~= 584 years. Which is still unreasonably long, but could still be had from a timer setting ~= never. An alternate to an assert could be saturation. Input "infinity", return "infinity". More or less. r~