On Wed, 8 Feb 2017, Andy Lutomirski wrote:
> > +#ifdef CONFIG_HYPERV_CLOCK
> > +/* (a * b) >> 64 implementation */
> > +static inline u64 mul64x64_hi(u64 a, u64 b)
> > +{
> > +       u64 a_lo, a_hi, b_lo, b_hi, p1, p2;
> > +
> > +       a_lo = (u32)a;
> > +       a_hi = a >> 32;
> > +       b_lo = (u32)b;
> > +       b_hi = b >> 32;
> > +       p1 = a_lo * b_hi;
> > +       p2 = a_hi * b_lo;
> > +
> > +       return a_hi * b_hi + (p1 >> 32) + (p2 >> 32) +
> > +               ((((a_lo * b_lo) >> 32) + (u32)p1 + (u32)p2) >> 32);
> > +
> > +}
> 
> Unless GCC is waaay more clever than I think, this is hugely
> suboptimal on 64-bit.  x86 can do this in a single instruction, and
> gcc can express it cleanly using __uint128_t.  I wouldn't be terribly
> surprised if the 32-bit generated code was fine, too.

We already have that: mul_u64_u64_shr() which can be replaced by an arch
specific implementation. Nobody bothered to do that for x86 yet, but we
definitely don't want to open code it another time

Thanks,

        tglx
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to