From: Al Viro <v...@ftp.linux.org.uk> On Behalf Of Al Viro
> Sent: 16 April 2021 20:44
> On Fri, Apr 16, 2021 at 12:24:13PM -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <eduma...@google.com>
> >
> > We have to loop only to copy u64 values.
> > After this first loop, we copy at most one u32, one u16 and one byte.
> 
> Does it actually yield a better code?
> 
> FWIW, this
> void bar(unsigned);
> void foo(unsigned n)
> {
>       while (n >= 8) {
>               bar(n);
>               n -= 8;
>       }
>       while (n >= 4) {
>               bar(n);
>               n -= 4;
>       }
>       while (n >= 2) {
>               bar(n);
>               n -= 2;
>       }
>       while (n >= 1) {
>               bar(n);
>               n -= 1;
>       }
> }

This variant might be better:

void foo(unsigned n)
{
        while (n >= 8) {
                bar(8);
                n -= 8;
        }
        if (likely(!n))
                return;
        if (n & 4)
                bar(4);
        if (n & 2)
                bar(2);
        if (n & 1)
                bar(1);
}

I think Al's version might have been optimised down to this,
but Eric's asm still contains the n -= 4/2/1 subtractions.
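
As a rough userspace check that the bit-test tail handles the same chunks as the loop-based tail, something like the sketch below could be used. Everything in it is an assumption for illustration: bar() is replaced by a stand-in that logs the chunk size, likely() is redefined locally (the kernel macro is not available in userspace), and foo_loops(), foo_bits() and variants_agree() are hypothetical names. It says nothing about which form gcc compiles better; it only compares behaviour.

```c
#include <string.h>

/* Userspace stand-in for the kernel's likely(); an assumption for this sketch. */
#if defined(__GNUC__)
#define likely(x) __builtin_expect(!!(x), 1)
#else
#define likely(x) (x)
#endif

#define MAX_LOG 64

/* Log of the chunk sizes passed to bar(), so two runs can be compared. */
static unsigned log_buf[MAX_LOG];
static unsigned log_len;

/* Hypothetical stand-in for bar(): records the chunk size it was given. */
void bar(unsigned size)
{
	if (log_len < MAX_LOG)
		log_buf[log_len] = size;
	log_len++;
}

/* Loop-based tail: every remainder step is written as a while loop. */
void foo_loops(unsigned n)
{
	while (n >= 8) { bar(8); n -= 8; }
	while (n >= 4) { bar(4); n -= 4; }
	while (n >= 2) { bar(2); n -= 2; }
	while (n >= 1) { bar(1); n -= 1; }
}

/* Bit-test tail: after the 8-byte loop n < 8, so each bit is tested once. */
void foo_bits(unsigned n)
{
	while (n >= 8) { bar(8); n -= 8; }
	if (likely(!n))
		return;
	if (n & 4)
		bar(4);
	if (n & 2)
		bar(2);
	if (n & 1)
		bar(1);
}

/* Returns 1 if both variants pass the same chunk sizes, in order, for n. */
int variants_agree(unsigned n)
{
	unsigned saved[MAX_LOG];
	unsigned saved_len;

	log_len = 0;
	foo_loops(n);
	saved_len = log_len;
	memcpy(saved, log_buf, sizeof(saved));

	log_len = 0;
	foo_bits(n);

	if (saved_len != log_len || log_len > MAX_LOG)
		return 0;
	return memcmp(saved, log_buf, log_len * sizeof(log_buf[0])) == 0;
}
```

Calling variants_agree() for every n in a small range (the log holds 64 entries, so n up to a few hundred fits) should return 1 each time, since once n < 8 each of the 4/2/1 loops can run at most once.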

OTOH gcc can make a real pig's breakfast of code like this!

        David

