On Thu, 2025-10-23 at 11:24 +0100, Andrew Stubbs wrote:
> On 22/10/2025 12:23, Thomas Schwinge wrote:
> > Hi!
> >
> > On 2025-10-21T13:41:33+0530, Avinash Jayakar
> > <[email protected]> wrote:
> > > On Tue, 2025-10-21 at 09:24 +0200, Thomas Schwinge wrote:
> > > > On 2025-10-21T11:46:04+0530, Avinash Jayakar
> > > > <[email protected]>
> > > > wrote:
> > > > > Some targets (aarch64 and x86_64 with multilib) reported
> > > > > regression
> > > > > for some
> > > > > test cases made for PR104116.
> > > >
> > > > Thanks for looking into this.
> > > >
> > > > I've similarly observed for '--target=amdgcn-amdhsa':
> > >
> > > I hope the issue is the same and this patch fixes it.
> >
> > Your commit r16-4535-g01c854c2a3b7d7a3207c3e63caf4e8422174fe96
> > "vect: Fix regression for PR104116" fixed some of those (in
> > particular,
> > the 'umod' ones), see below. (Tested '-march=gfx908' only, so
> > far.)
> >
> > > Is it possible to
> > > run this on x86_64, if so I can run and check this.
> >
> > You'd have to build GCC for '--target=amdgcn-amdhsa', which of
> > course
> > you're welcome to experiment with, but also Andrew (copied) and/or
> > I can
> > try to help resolve the remaining issues.
> >
> > > > +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c (test for excess
> > > > errors)
> > > > +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-ceil-umod-2.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> >
> > Resolved:
> >
> > [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-ceil-umod-2.c scan-
> > tree-dump-times vect "optimized: loop vectorized" 1
> >
> > > > +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c (test for
> > > > excess errors)
> > > > +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c execution
> > > > test
> > > > +FAIL: gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-tree-
> > > > dump-times vect "optimized: loop vectorized" 1
> >
> > Resolved:
> >
> > [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-
> > tree-dump-times vect "optimized: loop vectorized" 1
> >
> > > > +PASS: gcc.dg/vect/pr104116-round-div-2.c (test for excess
> > > > errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-div-2.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-round-div-2.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-div-pow2.c (test for
> > > > excess errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-div-pow2.c execution
> > > > test
> > > > +FAIL: gcc.dg/vect/pr104116-round-div-pow2.c scan-tree-
> > > > dump-times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-div.c (test for excess
> > > > errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-div.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-round-div.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-mod-2.c (test for excess
> > > > errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-mod-2.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-round-mod-2.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c (test for
> > > > excess errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c execution
> > > > test
> > > > +FAIL: gcc.dg/vect/pr104116-round-mod-pow2.c scan-tree-
> > > > dump-times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-mod.c (test for excess
> > > > errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-mod.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-round-mod.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> > > >
> > > > +PASS: gcc.dg/vect/pr104116-round-umod-2.c (test for
> > > > excess errors)
> > > > +PASS: gcc.dg/vect/pr104116-round-umod-2.c execution test
> > > > +FAIL: gcc.dg/vect/pr104116-round-umod-2.c scan-tree-dump-
> > > > times vect "optimized: loop vectorized" 1
> >
> > Resolved:
> >
> > [-FAIL:-]{+PASS:+} gcc.dg/vect/pr104116-round-umod-2.c scan-
> > tree-dump-times vect "optimized: loop vectorized" 1
> >
> > The other ones ('div', 'mod') still don't "optimized: loop
> > vectorized".
> > Andrew, are you able to quickly qualify these as expected vs. not
> > expected to vectorize for GCN? Context:
>
> I think the fails are *not* expected.
>
> The amdgcn backend does not define operators for rounded DIV or MOD
> (both operators require libgcc routines), but IIUC pr104116 is
> supposed
> to transform the code so that that doesn't matter.
>
> For pr104116-ceil-umod-pow2.c I see:
>
> vect_recog_divmod_pattern: detected: _5 = _4 %[cl] 19;
>
> But, for pr104116-round-mod.c, there's no such recognition, no
> transformation, and eventually:
>
> not vectorized: relevant stmt not supported: _5 = _4 %[rd] 19;
>
> I have not investigated why the operator is not recognized, but I do
> not
> believe the FAIL should be expected to fail for amdgcn.
>
Thanks for the info. I was also able to build and debug this for the
amdgcn target.

For the signed mod case, vectorization has an extra requirement: the
target must support a vector operation for ABS_EXPR, which amdgcn does
not have:

    if (!unsigned_p)
      {
        // check availability of abs expression for vector
        if (!target_has_vecop_for_code (ABS_EXPR, vectype))
          return NULL;

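To illustrate why the signed case would want an absolute value at all, here is a scalar reference for round-to-nearest modulo (ties away from zero), as I understand the ROUND_MOD_EXPR semantics. This is only a sketch: `round_mod_ref` is a hypothetical helper name, and it is not the sequence that vect_recog_divmod_pattern actually emits.

```c
/* Scalar reference for round-to-nearest modulo, ties away from zero.
   Hypothetical helper for illustration only; not GCC's lowering.
   Assumes b != 0 and that neither a nor b is INT_MIN.  */
static int
round_mod_ref (int a, int b)
{
  int r = a % b;                   /* truncating remainder, sign of a */
  int abs_r = r < 0 ? -r : r;      /* |r| */
  int abs_b = b < 0 ? -b : b;      /* |b| */

  /* The quotient rounds away from zero when 2*|r| >= |b|.
     Written as |r| >= |b| - |r| to avoid overflow in 2*|r|.  */
  if (abs_r >= abs_b - abs_r)
    r = r < 0 ? r + abs_b : r - abs_b;

  return r;
}
```

With the divisor 19 from the dumps above, `round_mod_ref (10, 19)` gives -9, because 10/19 rounds to a quotient of 1. In the unsigned case both magnitudes are the values themselves, which would be consistent with the check only being made when `!unsigned_p`.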
(1) We could modify the test cases to require target support for a
vector absolute-value operation, as we did with vect_condition. I am
not sure how to do that yet: in target-supports.exp I do not see an
effective-target keyword that checks for vector absolute-value support.

(2) Or, better yet, if a target does not support ABS_EXPR, we could
fall back to implementing the absolute value with vectorized
statements: a < 0 ? -a : a (LT_EXPR, COND_EXPR and NEGATE_EXPR). This
would increase the number of statements, but the vectorizer's cost
analysis could then choose between the vector and the scalar
implementation.
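
As a minimal sketch of what option (2) computes per element, written as a scalar C loop rather than as the GIMPLE the vectorizer would build. The helper name `abs_via_select` is hypothetical and the mapping to tree codes in the comments is my assumption of how the fallback would look.

```c
#include <assert.h>
#include <stdint.h>

/* Element-wise absolute value built from compare, negate and select,
   for targets without a vector ABS_EXPR.  Hypothetical helper for
   illustration only.  INT32_MIN is excluded from the inputs because
   negating it overflows, as with abs ().  */
static void
abs_via_select (const int32_t *in, int32_t *out, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      int32_t a = in[i];
      int is_neg = a < 0;          /* LT_EXPR: builds the lane mask */
      int32_t neg = -a;            /* NEGATE_EXPR: done for every lane */
      out[i] = is_neg ? neg : a;   /* COND_EXPR: per-lane select */
    }
}
```

That is three operations per element instead of one, which is the extra cost the vectorizer's cost model would have to weigh.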

Please let me know if you need help resolving this; I can submit a
patch with the required changes for option (2).

Thanks and regards,
Avinash Jayakar
> Andrew
>
> > > > > [...]
> > > > > The commit gcc-16-4464-g6883d51304f added 30 new tests for
> > > > > testing
> > > > > vectorization of {FLOOR,MOD,ROUND}_{DIV,MOD}_EXPR. [...]
> >
> >
> > Grüße
> > Thomas
> >
> >