** Description changed:
[Impact]
This bug causes data corruption in ARM64 code compiled with the Scalable
Vector Extension (SVE) enabled for a 256-bit SVE processor but executed on
128-bit SVE processors.
An example is an AWS workload built for Graviton3 but executed on Graviton4.
When the compiler compiled the ~ConstA (NOT ConstA) expression to compute
an index into the vector, it actually computed -ConstA (minus ConstA):
e.g. ~4 produced -4 instead of -5.
Graviton4 processes a 256-bit vector in two passes. During the second
pass it runs into this bug when computing indices into the second half
of the vector and ends up with {-4, -5, -6, -7}, processing the last
element of the first half twice and never touching the last element of
the vector.
This data corruption may cause data loss, failing checksums, and
potentially security issues.
[Test Plan]
I used a Raspberry Pi 5 for testing, but any other ARM64 platform or
virtual machine will be sufficient.
Install QEMU in noble:
apt install qemu-user-static
Launch an LXD VM for the affected release, e.g.:
lxc launch ubuntu-daily:jammy tester
lxc file push test.c tester/home/ubuntu/
Install affected gcc:
lxc exec tester -- /bin/sh -c "apt-get update && apt-get install -y gcc-9"
Compile the reproducer[1]:
lxc exec tester -- /bin/sh -c "gcc-9 -fno-inline -O3 -Wall -fno-strict-aliasing -march=armv8.4-a+sve -o /home/ubuntu/final /home/ubuntu/test.c"
Fetch the reproducer:
lxc file pull tester/home/ubuntu/final final
Execute the testcase:
qemu-aarch64-static -cpu neoverse-n2 ./final
The testcase will output:
PASS: got 0x00bbbbbb 0x00aaaaaa as expected
if the bug is fixed, and
ERROR: expected 0x00bbbbbb 0x00aaaaaa but got 0x00bbbbbb 0xaaaaaa00
otherwise.
[Where problems could occur]
The issue is a typo in the code that is used to calculate the offset
into the vector.
The corrupted data (e.g. checksums) calculated by the affected code will
not match the values produced after the fix. This may force end users
who rely on the calculated hash values to rebuild their indices after
their workloads are recompiled by the fixed gcc.
[Other info]
Focal fixes will be done through the -pro updates.
I have run the test case and set the bug tasks to Invalid for the
versions that are not affected by this issue.
Affected:
All gcc-8[2]
All gcc-9[2]
All gcc-11[2]
gcc-12 in Noble and earlier
gcc-13 in Noble and earlier
gcc-14 in Noble and earlier
gcc-15 is not affected.
The fixed packages will be uploaded to the stable PPA[3] created for
this SRU.
The PPA depends on -security only. The packages will need to be
binary-copied to -updates and -security.
[1]
https://bugs.launchpad.net/ubuntu/plucky/+source/gcc-14/+bug/2101084/comments/39
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118976#c21
[3] https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/lp-2101084
Original Description:
[Impact]
This issue affects SVE vectorization on arm64 platforms, specifically in
cases where bitwise-not operations are applied during optimization.
[Fix]
This issue has been resolved by an upstream patch.
commit 78380fd7f743e23dfdf013d68a2f0347e1511550
Author: Richard Sandiford <[email protected]>
Date: Tue Mar 4 10:44:35 2025 +0000
Fix folding of BIT_NOT_EXPR for POLY_INT_CST [PR118976]
There was an embarrassing typo in the folding of BIT_NOT_EXPR for
POLY_INT_CSTs: it used - rather than ~ on the poly_int. Not sure
how that happened, but it might have been due to the way that
~x is implemented as -1 - x internally.
gcc/
PR tree-optimization/118976
* fold-const.cc (const_unop): Use ~ rather than - for
BIT_NOT_EXPR.
* config/aarch64/aarch64.cc (aarch64_test_sve_folding): New
function.
(aarch64_run_selftests): Run it.
[Test Plan]
1. Launch an instance using the latest generation of Graviton processors
(Graviton4).
2. Compile the following code using the command `gcc -O3 -march=armv8.1-a+sve`:
#include <stdint.h>
#include <stdio.h>

#ifndef NCOUNTS
#define NCOUNTS 2
#endif

typedef struct {
    uint32_t state[5];
    uint32_t count[NCOUNTS];
    unsigned char buffer[64];
} SHA1_CTX;

void finalcount_av(SHA1_CTX *restrict ctx, unsigned char *restrict finalcount) {
    // ctx->count is: uint32_t count[2];
    int count_idx;
    for (int i = 0; i < 4*NCOUNTS; i++) {
        count_idx = (4*NCOUNTS - i - 1)/4; // generic but equivalent for NCOUNTS==2.
        finalcount[i] = (unsigned char)((ctx->count[count_idx] >> ((3-(i & 3)) * 8)) & 255);
    }
}

void finalcount_bv(SHA1_CTX *restrict ctx, unsigned char *restrict finalcount) {
    for (int i = 0; i < 4*NCOUNTS; i += 4) {
        int ci = (4*NCOUNTS - i - 1)/4;
        finalcount[i+0] = (unsigned char)((ctx->count[ci] >> (3 * 8)) & 255);
        finalcount[i+1] = (unsigned char)((ctx->count[ci] >> (2 * 8)) & 255);
        finalcount[i+2] = (unsigned char)((ctx->count[ci] >> (1 * 8)) & 255);
        finalcount[i+3] = (unsigned char)((ctx->count[ci] >> (0 * 8)) & 255);
    }
}

int main() {
    unsigned char fa[NCOUNTS*4];
    unsigned char fb[NCOUNTS*4];
    uint32_t *for_print;
    int i;
    SHA1_CTX ctx;

    ctx.count[0] = 0xaaaaaa00;
    ctx.count[1] = 0xbbbbbb00;
    if (NCOUNTS > 2) ctx.count[2] = 0xcccccc00;
    if (NCOUNTS > 3) ctx.count[3] = 0xdddddd00;

    finalcount_av(&ctx, fa);
    finalcount_bv(&ctx, fb);

    int ok = 1;
    for (i = 0; i < NCOUNTS*4; i++) {
        ok &= fa[i] == fb[i];
    }
    if (!ok) {
        for_print = (uint32_t*)fb;
        printf("ERROR: expected ");
        for (i = 0; i < NCOUNTS; i++) {
            printf("0x%08x ", for_print[i]);
        }
        for_print = (uint32_t*)fa;
        printf("but got ");
        for (i = 0; i < NCOUNTS; i++) {
            printf("0x%08x ", for_print[i]);
        }
        printf("\n");
        return 1;
    } else {
        for_print = (uint32_t*)fa;
        printf("PASS: got ");
        for (i = 0; i < NCOUNTS; i++) {
            printf("0x%08x ", for_print[i]);
        }
        printf("as expected\n");
        return 0;
    }
}
3. Verify that the execution output does not contain the string "ERROR".
[Where problems could occur]
The issue is caused by a typo. If any regressions occur, they are expected
to affect only specific instructions under certain scenarios rather than
disrupting the overall functionality.
--
https://bugs.launchpad.net/bugs/2101084
Title:
GCC produces wrong code for arm64+sve in some cases