> -----Original Message-----
> From: Eelco Chaudron <[email protected]>
> Sent: Friday 25 November 2022 12:26
> To: Finn, Emma <[email protected]>
> Cc: [email protected]; Van Haaren, Harry
> <[email protected]>
> Subject: Re: [v4] odp-execute: Add ISA implementation of set_masked IPv6
> action
>
>
>
> On 24 Nov 2022, at 10:30, Emma Finn wrote:
>
> > This commit adds support for the AVX512 implementation of the
> > ipv6_set_addrs action as well as an AVX512 implementation of updating
> > the L4 checksums.
> >
> > Signed-off-by: Emma Finn <[email protected]>
>
> Thanks Emma for the v4, I have one question and a couple of style issues. To
> speed things up I just provide the diff for the style issues.
>
> I was not able to do any actual testing, as my system did not have the
> avx512vbmi extension :(
>
> Cheers,
>
> Eelco
>
> > ---
>
> Style issues diff:
>
> diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c index
> 82ff7e647..f798d6708 100644
> --- a/lib/odp-execute-avx512.c
> +++ b/lib/odp-execute-avx512.c
> @@ -20,9 +20,9 @@
>
> #include <config.h>
> #include <errno.h>
> -#include <sys/types.h>
> #include <netinet/in.h>
> #include <netinet/ip6.h>
> +#include <sys/types.h>
>
For FreeBSD, network headers must be included in a certain order.
See this commit for details:
b2befd5bb2db ("sparse: Add guards to prevent FreeBSD-incompatible #include
order.")
So these need to stay in this order.
> #include "csum.h"
> #include "dp-packet.h"
> @@ -547,8 +547,8 @@ avx512_ipv6_sum_header(__m512i ip6_header)
> * horizontal add. */
> __m256i v_swap32a = _mm256_setr_epi32(0x0, 0x4, 0xF, 0xF,
> 0xF, 0xF, 0xF, 0xF);
> - v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
>
> + v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
> v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
> v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
>
> @@ -562,7 +562,7 @@ avx512_ipv6_addr_csum_delta(__m512i old_header,
> __m512i new_header) {
> uint16_t old_delta = avx512_ipv6_sum_header(old_header);
> uint16_t new_delta = avx512_ipv6_sum_header(new_header);
> - uint32_t csum_delta = (uint16_t)~old_delta + new_delta;
> + uint32_t csum_delta = ((uint16_t) ~old_delta) + new_delta;
>
> return ~csum_finish(csum_delta);
> }
> @@ -606,15 +606,15 @@ action_avx512_ipv6_set_addrs(struct
> dp_packet_batch *batch,
> __m512i v_key_shuf = _mm512_permutexvar_epi8(v_shuffle, v_key);
> __m512i v_mask_shuf = _mm512_permutexvar_epi8(v_shuffle, v_mask);
>
> -
> /* Set the v_zero register to all zero's. */
> const __m128i v_zeros = _mm_setzero_si128();
> +
> /* Set the v_all_ones register to all one's. */
> const __m128i v_all_ones = _mm_cmpeq_epi16(v_zeros, v_zeros);
>
> - /* Load ip6 src and dst respectively into 128-bit wide registers. */
> + /* Load ip6 src and dst masks respectively into 128-bit wide
> + registers. */
> __m128i v_src = _mm_loadu_si128((void *) mask);
> - __m128i v_dst = _mm_maskz_loadu_epi64(0xC,(void *) mask);
> + __m128i v_dst = _mm_maskz_loadu_epi64(0xC, (void *) mask);
>
> /* Perform a bitwise OR between src and dst registers. */
> __m128i v_or = _mm_or_si128(v_src, v_dst);
>
> > v4:
> > - Reworked and moved check for checksum outside loop.
> > - Code cleanup based on review from Eelco.
> > v3:
> > - Added a runtime check for AVX512 vbmi.
> > v2:
> > - Added check for availbility of s6_addr32 field of struct in6_addr.
> > - Fixed network headers for freebsd builds.
> >
>
> <SNIP>
>
> > + /* Load ip6 src and dst respectively into 128-bit wide registers. */
> > + __m128i v_src = _mm_loadu_si128((void *) mask);
> > + __m128i v_dst = _mm_maskz_loadu_epi64(0xC,(void *) mask);
>
> Guess it might be me, but I do not understand how
> _mm_maskz_loadu_epi64() will load the dst from the mask.
> Looking at the intrinsics guide it will only read the first two 64-bit
> values, but
> mask points to src?
>
> Should we not just do the following here?
>
> + __m128i v_src = _mm_loadu_si128((void *) &mask->ipv6_src);
> + __m128i v_dst = _mm_loadu_si128((void *) &mask->ipv6_dst);
>
Yes, good catch. The _maskz_load_ isn't correctly pointing to the dst values
from the mask.
I will make all the changes above (except the include header comment) and send
v5 shortly.
<SNIP>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev