> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Friday, 1 July 2022 19.05
> 
> On Fri, 1 Jul 2022 18:50:34 +0200
> Morten Brørup <m...@smartsharesystems.com> wrote:
> 
> > But I guess it is something else.
> >
> > Anyway, this function has ugly alignment problems (also before the
> > patch), and has gone through a couple of iterations to silence warnings
> > from the compiler. These warnings should have been addressed instead of
> > silenced. Mattias has suggested a far better solution [2] than mine,
> > which also correctly addresses the compiler alignment warnings, so we
> > will probably end up with his solution instead.
> >
> > [2] http://inbox.dpdk.org/dev/AM8PR07MB7666AD7BF7B780CC5062C14598BD9@AM8PR07MB7666.eurprd07.prod.outlook.com/T/#m1a76490541fce4a85b12d9390f2f4fac5a9f4660
> >
> 
> 
> Maybe some mix of the memcpy for unaligned and odd length and faster
> (unrolled?) version for the case of aligned and exact multiple?
> Or just take code from FreeBSD?

I just took a look at the BSD code. It starts with the same "if ptr is
unaligned" check as my patch, and then does some manual loop unrolling, which
we expect the compiler to do for us. Mattias has demonstrated that his solution
performs better, not only on modern x86 CPUs but also on an A72 CPU, so I
prefer his solution. And the difference between using the "vmovdqa" and
"vmovdqu" instructions here seems to be insignificant.
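
For illustration only, here is a minimal sketch of the general memcpy-based
idea discussed above: load each 16-bit word through memcpy() so that no
alignment is assumed, and leave unrolling and vectorization to the compiler.
This is not the code from the patch or from [2]; the names raw_cksum_sketch()
and cksum_fold_sketch() are hypothetical, and the sketch assumes the caller
keeps the buffer short enough that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the DPDK API: add the 16-bit words of buf to
 * sum without assuming any particular alignment of buf.
 * Assumption: len is small enough that the 32-bit sum does not overflow.
 */
static uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	size_t i;

	/*
	 * memcpy() into a local is an alignment-safe load; compilers
	 * typically turn it into a plain (unaligned) load instruction and
	 * remain free to unroll or vectorize the loop.
	 */
	for (i = 0; i + sizeof(uint16_t) <= len; i += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p + i, sizeof(v));
		sum += v;
	}

	/* Odd trailing byte: treat it as a word padded with a zero byte. */
	if (i < len) {
		uint16_t v = 0;

		memcpy(&v, p + i, 1);
		sum += v;
	}

	return sum;
}

/* Fold the 32-bit accumulator into a 16-bit one's complement sum. */
static uint16_t
cksum_fold_sketch(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

With this shape there is no separate "if ptr is unaligned" branch; the same
loop handles aligned and unaligned buffers, and odd lengths are handled by the
tail.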
