https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96421

            Bug ID: 96421
           Summary: missing __builtin_ia32_pand256 in X86 AVX2 intrinsics
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: f.heckenb...@fh-soft.de
  Target Milestone: ---

% cat test.c       
#include <stdio.h>
#include <string.h>
#include <x86intrin.h>

/*
static __v4di __builtin_ia32_pand256 (__v4di a, __v4di b)
{
  __v4di r;
  __asm__ ("vpand %1, %2, %0" : "=x" (r) : "x" (a), "x" (b));
  return r;
}
*/

int main ()
{
  __v4di a, b, n, o, x;
  memset (&a, 3, sizeof (a));
  memset (&b, 5, sizeof (b));
  n = __builtin_ia32_pand256 (a, b);
  o = __builtin_ia32_por256  (a, b);
  x = __builtin_ia32_pxor256 (a, b);
  printf ("%x %x %x\n", *(int *) &n, *(int *) &o, *(int *) &x);
}

% gcc -mavx2 test.c
test.c: In function 'main':
test.c:21:7: warning: implicit declaration of function
'__builtin_ia32_pand256'; did you mean '__builtin_ia32_pabsd256'?
[-Wimplicit-function-declaration]
   n = __builtin_ia32_pand256 (a, b);
       ^~~~~~~~~~~~~~~~~~~~~~
       __builtin_ia32_pabsd256
test.c:21:5: error: incompatible types when assigning to type '__v4di' {aka
'__vector(4) long long int'} from type 'int'
   n = __builtin_ia32_pand256 (a, b);
     ^

There are 256-bit variants of "or" and "xor" (__builtin_ia32_por256 and
__builtin_ia32_pxor256), but the corresponding "and" is missing. I don't see
a reason why; the three operations look very similar.

It can be added with inline assembly, as shown in the commented-out code,
though that's not quite as efficient as a real built-in (e.g. the compiler
cannot constant-fold through the asm statement).
