https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123626
Jeffrey A. Law <law at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rdapp at gcc dot gnu.org
--- Comment #2 from Jeffrey A. Law <law at gcc dot gnu.org> ---
So the rv64 failure can be fixed by fixing a long-standing bug that Richard S
hinted at about a year ago.
Basically the RISC-V backend doesn't properly model that calls clobber VXRM.
As a result we treat the VXRM setting as transparent across calls when
placing VXRM writes, and the write ultimately either gets removed as
unnecessary or ends up in the wrong block, because the dataflow for VXRM is
just wrong. I'll commit a patch for that shortly.
That's the good news. The bad news is something else is still goofy here.
Let's take a slightly simpler testcase (with some of the initial loops
removed):
short a;
long long b;
char c[3][3][17];
_Bool d;
int main() {
  for (char g=0; g<3; g-=13)
    for (short j=3; j<d+21; j+=2)
      a += 2;
  if (a != 180)
    __builtin_abort ();
  __builtin_exit (0);
}
Compile for rv32-elf with -O3 -march=rv32gcv_zvl256b -fsigned-char
-fno-strict-aliasing -fwrapv and you'll end up calling abort rather than
exiting. newlib's abort just returns nonzero, so you have to query the exit
status:
./xgcc -B./ -O3 -march=rv32gcv_zvl256b -fsigned-char -fno-strict-aliasing \
  -fwrapv j.c -c && riscv32-elf-gcc j.o && ./a.out ; echo $?
1
If I bisect that smaller testcase, it starts failing at the point where we
start vectorizing it:
commit bc10ba9c33bb0cef335e8e0072e638fd4d404337
Author: Richard Biener <[email protected]>
Date:   Thu Nov 6 14:24:34 2025 +0100

    tree-optimization/122577 - missed vectorization of conversion from bool

    We are currently overly restrictive with rejecting conversions from
    bit-precision entities to mode precision ones. Similar to RTL expansion
    we can focus on non-bit operations producing bit-precision results
    which we currently do not properly handle by masking. Such checks
    should be already present. The following relaxes vectorizable_conversion.
    Actual bitfield accesses are catched and rejected by vectorizer dataref
    analysis and converted during if-conversion into mode-size accesses
    with appropriate sign- or zero-extension.

            PR tree-optimization/122577
            * tree-vect-stmts.cc (vectorizable_conversion): Allow conversions
            from non-mode-precision types.

            * gcc.dg/vect/vect-bool-3.c: New testcase.
I haven't tried to dive into the guts of the vectorized code yet.