From: Joe Perches
> Sent: 20 December 2017 17:20
...
> > I think this version works.
> > It doesn't have the optimisation for small values.
> >
> > unsigned int sqrt64(unsigned long long x)
> > {
> >         unsigned int x_hi = x >> 32;
> >
> >         unsigned int b = 0;
> >         unsigned int y = 0;
> >         unsigned int i;
> >
> 
> Perhaps add:
> 
>       if (x <= UINT_MAX)
>               return int_sqrt((unsigned long)x);

Actually something like:
        i = 32;
        if (!x_hi) {
                x_hi = x;
                i = 16;
        }
        
        if (!(x_hi & 0xffff0000)) {
                x_hi <<= 16;
                i -= 8;
        }
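Repeat for 0xff000000, 0xf0000000 and 0xc0000000 (shifting by 8, 4 and 2,
and dropping i by 4, 2 and 1, since each loop pass eats two bits), then
adjust the loop to count i down to zero.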
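
Put together, the idea would look roughly like the sketch below. It is only
an illustration: the name sqrt64_sketch is made up, and it keeps the
remainder in a 64-bit variable and uses the textbook "rem >= 2*root + 1"
step for simplicity, so it is not tuned for the 32-bit code generation
being compared in this thread.

```c
#include <stdint.h>

/* Hypothetical name; a sketch of the skip-leading-zeros idea only. */
uint32_t sqrt64_sketch(uint64_t x)
{
	uint32_t bits = (uint32_t)(x >> 32);	/* word being consumed */
	uint64_t rem = 0;	/* remainder, 64-bit so it cannot overflow */
	uint32_t root = 0;
	unsigned int i = 32;	/* bit pairs still to process */

	if (!bits) {		/* high word empty: start on the low word */
		bits = (uint32_t)x;
		i = 16;
	}

	/* Skip leading zero bit pairs: 8, 4, 2 and then 1 pair. */
	if (!(bits & 0xffff0000u)) { bits <<= 16; i -= 8; }
	if (!(bits & 0xff000000u)) { bits <<= 8;  i -= 4; }
	if (!(bits & 0xf0000000u)) { bits <<= 4;  i -= 2; }
	if (!(bits & 0xc0000000u)) { bits <<= 2;  i -= 1; }

	while (i) {
		/* Bring the next two bits of x into the remainder. */
		rem = (rem << 2) | (bits >> 30);
		bits <<= 2;
		if (--i == 16)
			bits = (uint32_t)x;	/* high word done, load low word */

		/* Try to set the next root bit: needs rem >= 2*root + 1. */
		root <<= 1;
		if (rem > 2 * (uint64_t)root) {
			rem -= 2 * (uint64_t)root + 1;
			root++;
		}
	}
	return root;
}
```

When the high word is zero, i starts at 16 or below, so the reload test
(--i == 16) never fires and the low word is only loaded once.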

        David

> 
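> >         for (i = 0; i < 32; i++) {
> >                 /* remainder needs more than 32 bits inside the step */
> >                 unsigned long long r = ((unsigned long long)b << 2) | (x_hi >> 30);
> >
> >                 x_hi <<= 2;
> >                 if (i == 15)
> >                         x_hi = x;
> >                 y <<= 1;
> >                 /* set the next root bit if r >= 2*y + 1 */
> >                 if (r > 2ULL * y) {
> >                         r -= 2ULL * y + 1;
> >                         y++;
> >                 }
> >                 b = r;  /* only truncates on the last pass, where it is unused */
> >         }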
> >         return y;
> > }
> >
> > Put it through cc -O3 -m32 -c -o sqrt64.o sqrt64.c and then objdump -d sqrt64.o
> > and compare the disassembly to that of your version.
> >
> >     David
> >
