https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101096

            Bug ID: 101096
           Summary: AVX512 VPMOV instruction should be used to downconvert
                    vectors
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcases should use the VPMOV down-convert instructions with
AVX512VL:

void
foo (unsigned short* p1, unsigned short* p2, char* __restrict p3)
{
  for (int i = 0 ; i != 16; i++)
    p3[i] = p1[i] + p2[i];
  return;
}

void
foo1 (unsigned int* p1, unsigned int* p2, short* __restrict p3)
{
  for (int i = 0 ; i != 8; i++)
    p3[i] = p1[i] + p2[i];
  return;
}

gcc -O3 -mavx512vl:

foo:
        vpbroadcastw    .LC1(%rip), %xmm0
        vpand   16(%rsi), %xmm0, %xmm2
        vpand   (%rsi), %xmm0, %xmm1
        vpackuswb       %xmm2, %xmm1, %xmm1
        vpand   (%rdi), %xmm0, %xmm2
        vpand   16(%rdi), %xmm0, %xmm0
        vpackuswb       %xmm0, %xmm2, %xmm0
        vpaddb  %xmm0, %xmm1, %xmm0
        vmovdqu %xmm0, (%rdx)
        ret

foo1:
        vpbroadcastd    .LC3(%rip), %xmm0
        vpand   16(%rsi), %xmm0, %xmm2
        vpand   (%rsi), %xmm0, %xmm1
        vpackusdw       %xmm2, %xmm1, %xmm1
        vpand   (%rdi), %xmm0, %xmm2
        vpand   16(%rdi), %xmm0, %xmm0
        vpackusdw       %xmm0, %xmm2, %xmm0
        vpaddw  %xmm0, %xmm1, %xmm0
        vmovdqu %xmm0, (%rdx)
        ret
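
For illustration only, here is a scalar sketch of why a truncating down-convert could stand in for the mask-and-pack sequence in foo. It assumes .LC1 holds the word mask 0x00ff; the constant is not shown above, so that value is a guess. The helper names (pack_lane_u16, trunc_lane_u16, models_agree) are hypothetical and are not part of GCC or the testcase. This models one lane in plain C; it is not the code GCC should emit.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of "vpand mask, x" followed by vpackuswb.
   ASSUMPTION: the broadcast constant .LC1 is 0x00ff.
   vpackuswb treats each word as signed and saturates it to 0..255. */
static uint8_t pack_lane_u16(uint16_t x)
{
    int16_t masked = (int16_t)(x & 0x00ff);   /* vpand */
    if (masked < 0)   return 0;               /* saturation, never hit after masking */
    if (masked > 255) return 255;             /* saturation, never hit after masking */
    return (uint8_t)masked;
}

/* One lane of a truncating word-to-byte down-convert (the vpmovwb behaviour
   the report asks for): keep the low 8 bits, no saturation. */
static uint8_t trunc_lane_u16(uint16_t x)
{
    return (uint8_t)x;
}

/* Returns 1 if both models give the same byte as the source loop
   p3[i] = p1[i] + p2[i], using unsigned bytes to keep the model portable. */
int models_agree(void)
{
    for (uint32_t a = 0; a <= 0xffffu; a++) {
        /* Lane equivalence: mask + saturating pack == truncation. */
        if (pack_lane_u16((uint16_t)a) != trunc_lane_u16((uint16_t)a))
            return 0;

        /* Sampled second operand; the stride keeps the run time short. */
        for (uint32_t b = 0; b <= 0xffffu; b += 251) {
            uint8_t ref = (uint8_t)(a + b);               /* source semantics */
            uint8_t packed = (uint8_t)(pack_lane_u16((uint16_t)a)
                                       + pack_lane_u16((uint16_t)b)); /* pack, then vpaddb */
            uint8_t truncated = trunc_lane_u16((uint16_t)(a + b));    /* add words, then truncate */
            if (packed != ref || truncated != ref)
                return 0;
        }
    }
    return 1;
}
```

Calling models_agree() from a small driver should return 1. The same argument would apply to foo1 with 32-bit lanes and a 0x0000ffff mask. As far as I know, the word-to-byte form (vpmovwb) also needs AVX512BW in addition to AVX512VL, so the foo case may need -mavx512bw as well.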