> -----Original Message-----
> From: Eelco Chaudron <[email protected]>
> Sent: Friday 25 November 2022 12:26
> To: Finn, Emma <[email protected]>
> Cc: [email protected]; Van Haaren, Harry
> <[email protected]>
> Subject: Re: [v4] odp-execute: Add ISA implementation of set_masked IPv6
> action
> 
> 
> 
> On 24 Nov 2022, at 10:30, Emma Finn wrote:
> 
> > This commit adds support for the AVX512 implementation of the
> > ipv6_set_addrs action as well as an AVX512 implementation of updating
> > the L4 checksums.
> >
> > Signed-off-by: Emma Finn <[email protected]>
> 
> Thanks Emma for the v4, I have one question and a couple of style issues. To
> speed things up I just provide the diff for the style issues.
> 
> I was not able to do any actual testing, as my system did not have the
> avx512vbmi extension :(
> 
> Cheers,
> 
> Eelco
> 
> > ---
> 
> Style issues diff:
> 
> diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c index
> 82ff7e647..f798d6708 100644
> --- a/lib/odp-execute-avx512.c
> +++ b/lib/odp-execute-avx512.c
> @@ -20,9 +20,9 @@
> 
>  #include <config.h>
>  #include <errno.h>
> -#include <sys/types.h>
>  #include <netinet/in.h>
>  #include <netinet/ip6.h>
> +#include <sys/types.h>
> 
For FreeBSD, network headers must be included in a certain order. 
See this commit for details:
    b2befd5bb2db ("sparse: Add guards to prevent FreeBSD-incompatible #include 
order.")
So these need to stay in this order. 

>  #include "csum.h"
>  #include "dp-packet.h"
> @@ -547,8 +547,8 @@ avx512_ipv6_sum_header(__m512i ip6_header)
>       * horizontal add. */
>      __m256i v_swap32a = _mm256_setr_epi32(0x0, 0x4, 0xF, 0xF,
>                                            0xF, 0xF, 0xF, 0xF);
> -    v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
> 
> +    v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
>      v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>      v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
> 
> @@ -562,7 +562,7 @@ avx512_ipv6_addr_csum_delta(__m512i old_header,
> __m512i new_header)  {
>      uint16_t old_delta = avx512_ipv6_sum_header(old_header);
>      uint16_t new_delta = avx512_ipv6_sum_header(new_header);
> -    uint32_t csum_delta = (uint16_t)~old_delta + new_delta;
> +    uint32_t csum_delta = ((uint16_t) ~old_delta) + new_delta;
> 
>      return  ~csum_finish(csum_delta);
>  }
> @@ -606,15 +606,15 @@ action_avx512_ipv6_set_addrs(struct
> dp_packet_batch *batch,
>      __m512i v_key_shuf = _mm512_permutexvar_epi8(v_shuffle, v_key);
>      __m512i v_mask_shuf = _mm512_permutexvar_epi8(v_shuffle, v_mask);
> 
> -
>      /* Set the v_zero register to all zero's. */
>      const __m128i v_zeros = _mm_setzero_si128();
> +
>      /* Set the v_all_ones register to all one's. */
>      const __m128i v_all_ones = _mm_cmpeq_epi16(v_zeros, v_zeros);
> 
> -    /* Load ip6 src and dst respectively into 128-bit wide registers. */
> +    /* Load ip6 src and dst masks respectively into 128-bit wide
> + registers. */
>      __m128i v_src = _mm_loadu_si128((void *) mask);
> -    __m128i v_dst = _mm_maskz_loadu_epi64(0xC,(void *) mask);
> +    __m128i v_dst = _mm_maskz_loadu_epi64(0xC, (void *) mask);
> 
>      /* Perform a bitwise OR between src and dst registers. */
>      __m128i v_or = _mm_or_si128(v_src, v_dst);
> 
> > v4:
> >   - Reworked and moved check for checksum outside loop.
> >   - Code cleanup based on review from Eelco.
> > v3:
> >   - Added a runtime check for AVX512 vbmi.
> > v2:
> >   - Added check for availbility of s6_addr32 field of struct in6_addr.
> >   - Fixed network headers for freebsd builds.
> >
> 
> <SNIP>
> 
> > +    /* Load ip6 src and dst respectively into 128-bit wide registers. */
> > +    __m128i v_src = _mm_loadu_si128((void *) mask);
> > +    __m128i v_dst = _mm_maskz_loadu_epi64(0xC,(void *) mask);
> 
> Guess it might be me, but I do not understand how
> _mm_maskz_loadu_epi64() will load the dst from the mask.
> Looking at the intrinsics guide it will only read the first two 64-bit 
> values, but
> mask points to src?
> 
> Should we not just do the following here?
> 
> +    __m128i v_src = _mm_loadu_si128((void *) &mask->ipv6_src);
> +    __m128i v_dst = _mm_loadu_si128((void *) &mask->ipv6_dst);
> 

Yes, good catch. The _maskz_load_ isn't correctly pointing to the dst values 
from the mask. 
 
I will make all the changes above (except the include header comment) and send 
v5 shortly. 

<SNIP>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to