On Tue, 8 Dec 2020 at 23:39, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> On 12/8/20 12:15 PM, Peter Maydell wrote:
> > +static inline uint64_t clock_ticks_to_ns(const Clock *clk, uint64_t ticks)
> > +{
> > +    uint64_t ns_low, ns_high;
> > +
> > +    /*
> > +     * clk->period is the period in units of 2^-32 ns, so
> > +     * (clk->period * ticks) is the required length of time in those
> > +     * units, and we can convert to nanoseconds by multiplying by
> > +     * 2^32, which is the same as shifting the 128-bit multiplication
> > +     * result right by 32.
> > +     */
> > +    mulu64(&ns_low, &ns_high, clk->period, ticks);
> > +    return ns_low >> 32 | ns_high << 32;
>
> With the shift, you're discarding the high 32 bits of the result. You'll
> lose those same bits if you shift one of the inputs left by 32, and use
> only the high part of the result, e.g.
>
>   mulu(&discard, &ret, clk->period, ticks << 32);
>   return ret;
>
> Which on some hosts, e.g. aarch64, only requires umulh and not two
> multiply instructions.
We can't do this if we want to allow a full 64-bit 'ticks' input, right?

> Either way, I wonder if you want to either use uint32_t ticks, or assert
> that ticks <= UINT32_MAX?  Or if you don't shift ticks, assert that
> ns_high <= UINT32_MAX, so that you don't lose output bits?

So I think my plan for v2 of this series is just to add in the
saturation-to-INT64_MAX logic.

thanks
-- PMM
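
[Editor's sketch, not the actual v2 patch: one possible shape for the
saturation PMM describes. The Clock struct and mulu64() are stubbed here
(via __uint128_t, GCC/Clang) so the arithmetic compiles standalone; in QEMU
the real ones come from "hw/clock.h" and "qemu/host-utils.h", and the exact
overflow check in the merged series may differ.]

  #include <stdint.h>
  #include <inttypes.h>
  #include <stdio.h>

  typedef struct {
      uint64_t period;   /* period in units of 2^-32 ns, as in hw/clock.h */
  } Clock;

  /* 64x64 -> 128-bit unsigned multiply, standing in for QEMU's mulu64() */
  static void mulu64(uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b)
  {
      __uint128_t r = (__uint128_t)a * b;
      *plow = (uint64_t)r;
      *phigh = (uint64_t)(r >> 64);
  }

  static inline uint64_t clock_ticks_to_ns(const Clock *clk, uint64_t ticks)
  {
      uint64_t ns_low, ns_high;

      mulu64(&ns_low, &ns_high, clk->period, ticks);
      /*
       * The value we want is the 128-bit product shifted right by 32.
       * If any of bits [127:95] of the product are set (i.e. bits
       * [63:31] of ns_high), that value does not fit in an int64_t,
       * so saturate to INT64_MAX instead of silently wrapping.
       */
      if (ns_high >> 31) {
          return INT64_MAX;
      }
      return ns_low >> 32 | ns_high << 32;
  }

  int main(void)
  {
      Clock clk = { .period = UINT64_C(1) << 32 };   /* 1 ns per tick */

      /* 1000 ticks at 1 ns/tick -> 1000 ns */
      printf("%" PRIu64 "\n", clock_ticks_to_ns(&clk, 1000));
      /* overflowing input saturates to INT64_MAX */
      printf("%" PRIu64 "\n", clock_ticks_to_ns(&clk, UINT64_MAX));
      return 0;
  }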