>  static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt) {
> +     int i;
> +     __le32 *psrc = (__le32 *)src;
> +
> +     /*
> +      * the buffer is already in big endian. For little endian machines that's
> +      * fine. For big endian machines we must swap since the chipset swaps again
> +      */
> +     for (i = 0; i < bytecnt / 4; ++i)
> +             psrc[i] = le32_to_cpu(psrc[i]);
> +
>       __iowrite64_copy(dst, src, bytecnt / 8);
>  }

That code looks horrid...
1) I'm not sure the caller expects its source buffer to be
   byteswapped in place.
2) It adds a lot of memory cycles: every word is read, swapped and
   written back before the copy reads it again.
3) From the call sites this looks like it is copying descriptors,
   so the transfer length is probably only 1 or 2 words, which
   makes the loop overhead significant.
4) ppc doesn't have a fast byteswap instruction (very new gcc might
   use the byteswapping memory access for the le32_to_cpu() though),
   so it would be better to get the byteswap done inside
   __iowrite64_copy(), since that is probably requesting a byteswap
   anyway.
OTOH I'm not at all clear about the 64bit xfers....
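
To make points 1, 2 and 4 concrete, here is a rough userspace sketch of
doing the swap as part of the copy instead of in place. It is not driver
code: bf_copy_swapped() and swab32_sketch() are hypothetical names, the
destination is plain memory rather than an __iomem mapping, and the swap
is unconditional here (the patch above would only need it on big endian
machines). It also copies 32-bit words, so it says nothing about how the
64-bit transfers should behave.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: portable 32-bit byteswap, not a kernel API. */
static inline uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/*
 * Hypothetical copy routine: moves bytecnt bytes from src to dst as
 * 32-bit words and byteswaps each word on the way. src is const, so
 * the caller's buffer is left untouched and each word is read once.
 */
static void bf_copy_swapped(volatile uint32_t *dst, const uint32_t *src,
			    size_t bytecnt)
{
	size_t i;

	for (i = 0; i < bytecnt / 4; ++i)
		dst[i] = swab32_sketch(src[i]);
}
```

The source is read once and never written, which avoids both the
corrupted buffer and the extra store/reload pass of the in-place loop.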

