https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198

            Bug ID: 91198
           Summary: GCC not generating AVX-512 compress/expand
                    instructions
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: moritz.kreutzer at siemens dot com
  Target Milestone: ---

We have a simple loop that selects values from one array based on a condition
and stores the selected values contiguously in a second array:


https://godbolt.org/z/T7UXXD
================================
float const threshold = 0.5; 
int o = 0;
for (int i = 0; i < size; ++i) {
  if (input[i] < threshold) {
    output[o] = input[i];
    o++;
  } 
}
================================

GCC does not appear to generate AVX-512 vcompressps instructions for this
code. The same holds for the inverse pattern (expansion using vexpandps). Is
this a missed optimization in GCC, or does something else in the example code
prevent vectorization?
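
For illustration only, here is a minimal sketch of the two patterns in question.
It is not taken from the report: the function names compress_below and
expand_into, the keep mask array, and the scalar form of the expansion loop are
assumptions made for this example. The first function shows roughly what a
hand-written AVX-512 version of the compress loop could look like (with a
scalar fallback when AVX-512F is not enabled); the second shows one plausible
scalar form of the expansion loop that vexpandps could serve.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: copy every input[i] < threshold into output,
 * packed contiguously, and return the number of values written. */
static size_t compress_below(const float *input, float *output,
                             size_t size, float threshold)
{
    size_t o = 0;
    size_t i = 0;
#if defined(__AVX512F__)
    const __m512 vthr = _mm512_set1_ps(threshold);
    /* Process 16 floats per iteration. */
    for (; i + 16 <= size; i += 16) {
        __m512 v = _mm512_loadu_ps(input + i);
        /* One mask bit per lane where input[i] < threshold. */
        __mmask16 m = _mm512_cmp_ps_mask(v, vthr, _CMP_LT_OQ);
        /* Store only the selected lanes, packed (vcompressps). */
        _mm512_mask_compressstoreu_ps(output + o, m, v);
        o += (size_t)__builtin_popcount((unsigned)m);
    }
#endif
    /* Scalar tail, and full fallback without AVX-512F. */
    for (; i < size; ++i) {
        if (input[i] < threshold) {
            output[o++] = input[i];
        }
    }
    return o;
}

/* Hypothetical helper for the inverse pattern: consecutive input values
 * are written to the output slots where keep[i] is nonzero; other slots
 * are left untouched. Returns the number of input values consumed. */
static size_t expand_into(const float *input, const unsigned char *keep,
                          float *output, size_t size)
{
    size_t o = 0;
    for (size_t i = 0; i < size; ++i) {
        if (keep[i]) {
            output[i] = input[o++];
        }
    }
    return o;
}
```

The intrinsics path is only compiled when AVX-512F is enabled (for example
with -mavx512f); otherwise both functions are plain scalar loops of the kind
the auto-vectorizer would need to recognize.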
