On 22/10/2025 12:23, Thomas Schwinge wrote:
Hi!

On 2025-10-21T13:41:33+0530, Avinash Jayakar wrote:
On Tue, 2025-10-21 at 09:24 +0200, Thomas Schwinge wrote:
On 2025-10-21T11:46:04+0530, Avinash Jayakar wrote:
Some targets (aarch64 and x86_64 with multilib) reported regressions for some
test cases made for PR104116.

Thanks for looking into this.

I've similarly observed for '--target=amdgcn-amdhsa':

I hope the issue is the same and this patch fixes it.

Your commit r16-4535-g01c854c2a3b7d7a3207c3e63caf4e8422174fe96
"vect: Fix regression for PR104116" fixed some of those (in particular,
the 'umod' ones), see below.  (Tested '-march=gfx908' only, so far.)

Is it possible to run this on x86_64? If so, I can run and check this.

You'd have to build GCC for '--target=amdgcn-amdhsa', which of course
you're welcome to experiment with, but also Andrew (copied) and/or I can
try to help resolve the remaining issues.

     +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c execution test
     +FAIL: gcc.dg/vect/pr104116-ceil-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

Resolved:

     [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-ceil-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c execution test
     +FAIL: gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

Resolved:

     [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-div-2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-div-2.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-div-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-div-pow2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-div-pow2.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-div-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-div.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-div.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-div.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-mod-2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-mod-2.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-mod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-mod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-mod.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-mod.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-mod.c scan-tree-dump-times vect "optimized: loop vectorized" 1

     +PASS: gcc.dg/vect/pr104116-round-umod-2.c (test for excess errors)
     +PASS: gcc.dg/vect/pr104116-round-umod-2.c execution test
     +FAIL: gcc.dg/vect/pr104116-round-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

Resolved:

     [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-round-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

The other ones ('div', 'mod') still don't "optimized: loop vectorized".
Andrew, are you able to quickly qualify these as expected vs. not
expected to vectorize for GCN?  Context:

I think the fails are *not* expected.

The amdgcn backend does not define operators for rounded DIV or MOD (both operators require libgcc routines), but IIUC pr104116 is supposed to transform the code so that that doesn't matter.

For pr104116-ceil-umod-pow2.c I see:

  vect_recog_divmod_pattern: detected: _5 = _4 %[cl] 19;

But, for pr104116-round-mod.c, there's no such recognition, no transformation, and eventually:

  not vectorized: relevant stmt not supported: _5 = _4 %[rd] 19;

I have not investigated why the operator is not recognized, but I do not believe these tests should be expected to fail for amdgcn.
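
For readers unfamiliar with the dump notation: '%[rd]' is how GCC prints ROUND_MOD_EXPR, the remainder paired with a quotient rounded to the nearest integer. The following is only a rough sketch of how that result can be built from the ordinary truncating '/' and '%' operators, which is the kind of rewrite that makes a missing backend operator irrelevant. It is not the vectorizer's actual lowering and not the content of the test files. The names round_div, round_mod and round_mod_loop are hypothetical, and the sketch assumes a positive divisor and that ties round away from zero.

```c
/* Hypothetical helpers, not GCC code.  Assumptions: d > 0, and "round"
   means nearest integer with ties away from zero.  */

static int
round_div (int x, int d)
{
  int q = x / d;                 /* truncating quotient */
  int r = x % d;                 /* remainder, same sign as x */
  int abs_r = r < 0 ? -r : r;

  /* Step one away from zero when the remainder is at least half of d.
     Written as abs_r >= d - abs_r to avoid overflow in 2 * abs_r.  */
  if (abs_r >= d - abs_r)
    q += (x < 0) ? -1 : 1;
  return q;
}

static int
round_mod (int x, int d)
{
  int r = x % d;
  int abs_r = r < 0 ? -r : r;

  /* Same condition as above; adjust the remainder to match the
     rounded quotient instead of recomputing x - q * d.  */
  if (abs_r >= d - abs_r)
    r += (x < 0) ? d : -d;
  return r;
}

/* A loop shaped like the statement in the dump: a rounded modulo by the
   constant 19 applied to every element.  */
void
round_mod_loop (int *a, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = round_mod (a[i], 19);
}
```

With a divisor of 19 no ties can occur, so the tie-breaking assumption does not affect this particular case.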

Andrew

[...]
The commit gcc-16-4464-g6883d51304f added 30 new tests for testing
vectorization of {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR. [...]


Grüße
  Thomas


