[Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #10 from Andrew Pinski ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650833.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #9 from Andrew Pinski ---
Created attachment 58022
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58022&action=edit
Patch which I tested

I still need to add the testcases and finish up the commit message and changelogs. I will do that tomorrow. Posting this here tonight so I don't lose the patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #8 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #7)
> here is a testcase for the fcsel usage for integer cmov:

A slightly better example where there is no use of inline-asm or forcing to specific registers:
```
#define vector16 __attribute__((vector_size(16)))
void foo (int a, int *b, vector16 int c, vector16 int d)
{
  int t = a ? c[0] : d[0];
  *b = t;
}
```
We should be able to produce:
```
foo:
        cmp     w0, 0
        fcsel   s1, s1, s0, eq
        str     s1, [x1]
        ret
```
And here is a decent one for float modes (-O2 -fno-ssa-phiopt is needed though, otherwise the tree level does the VCE after the cmov):
```
#define vector8 __attribute__((vector_size(8)))
void foo (int a, double *b, long long c, long long d)
{
  double ct;
  double dt;
  __builtin_memcpy(&ct, &c, sizeof(long long));
  __builtin_memcpy(&dt, &d, sizeof(long long));
  double t = a ? ct : dt;
  *b = t;
}
```
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #7 from Andrew Pinski ---
here is a testcase for the fcsel usage for integer cmov:
```
void foo (int a, int *b)
{
  int t = a ? 11 : 22;
  register int tt __asm__("s0");
  tt = t;
  asm("":"+w"(tt));
  *b = tt;
}
```
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #6 from Andrew Pinski ---
here is a testcase for the fcsel usage for integer:
```
void foo (int a, double *b)
{
  double t = a ? 1.0 : 200.0;
  register double tt __asm__("x0");
  tt = t;
  asm("":"+r"(tt));
  *b = tt;
}
```
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #5 from Andrew Pinski ---
So adding the `r` alternative to *cmov_insn (GPF) kind of works, but then we seem to have a register allocation issue.

Even this still causes FPREGS to be chosen:
```
void foo (int a, double *b)
{
  double t = a ? 1.0 : 200.0;
  asm("":"+r"(t));
  *b = t;
}
```
Someone else will need to look into the register allocator issue later on.

I did find a testcase where we don't get the fmovs though (it forces the use of x0):
```
void foo (int a, double *b)
{
  double t = a ? 1.0 : 200.0;
  register double tt __asm__("x0");
  tt = t;
  asm("":"+r"(tt));
  *b = tt;
}
```
With that we now get:
```
        cmp     w0, 0
        mov     x0, 149533581377536
        mov     x2, 4641240890982006784
        movk    x0, 0x40c3, lsl 48
        csel    x0, x2, x0, eq
        str     x0, [x1]
        ret
```
So at least I can write up a testcase ...
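As a side check on the listing in comment #5, the integer immediates can be decoded as IEEE 754 doubles to see which constants end up in the GPRs. The sketch below is only an illustration: the helper names (`double_from_bits`, `mov_then_movk48`, `listing_x0`, `listing_x2`) are hypothetical, and it assumes a 64-bit IEEE 754 `double`. Under that assumption x2 decodes to 200.0 and x0 (after the `movk`) to 10000.0, which suggests the listing may have been produced from a variant of the testcase that uses a different first constant than the 1.0 quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double; assumes IEEE 754 binary64. */
double double_from_bits(uint64_t bits)
{
  double d;
  _Static_assert(sizeof d == sizeof bits, "expected 64-bit double");
  memcpy(&d, &bits, sizeof d);
  return d;
}

/* Model of "mov x0, imm" followed by "movk x0, hi16, lsl 48":
   movk replaces bits 63:48 and keeps the remaining bits. */
uint64_t mov_then_movk48(uint64_t imm, uint16_t hi16)
{
  return (imm & 0x0000FFFFFFFFFFFFull) | ((uint64_t)hi16 << 48);
}

/* Value held in x0 after the mov/movk pair in the listing. */
double listing_x0(void)
{
  return double_from_bits(mov_then_movk48(149533581377536ull, 0x40c3));
}

/* Value held in x2 after the single mov in the listing. */
double listing_x2(void)
{
  return double_from_bits(4641240890982006784ull);
}
```

Calling `listing_x0()` and `listing_x2()` from a small driver and printing the results with `%g` is enough to reproduce the decoding.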
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski changed:

| What | Removed | Added |
|---|---|---|
| Assignee | unassigned at gcc dot gnu.org | pinskia at gcc dot gnu.org |
| Status | NEW | ASSIGNED |

--- Comment #4 from Andrew Pinski ---
I am going to look into this ...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski changed:

| What | Removed | Added |
|---|---|---|
| Last reconfirmed | | 2020-12-30 |
| Version | unknown | 11.0 |
| Status | UNCONFIRMED | NEW |
| Ever confirmed | 0 | 1 |

--- Comment #3 from Andrew Pinski ---
.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #2 from Andrew Pinski ---
(In reply to ktkachov from comment #1)
> Or a =r,r,r alternative to the FCSEL pattern instead...

We should most likely add the `r` alternative to *cmov_insn (GPF) and the `w` alternative to *cmov_insn (ALLI), so that moving back and forth between register files can be avoided in general.
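The reason extra alternatives are safe is that a conditional select only copies bits, so the result is the same whichever register file holds the operands. A minimal C sketch of that equivalence follows, assuming a 64-bit IEEE 754 `double`; the names `bits_of`, `select_fp` and `select_via_int_bits` are hypothetical and only illustrate the idea, they are not part of the patch or the machine description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double and uint64_t have the same size on the target. */
_Static_assert(sizeof(double) == sizeof(uint64_t), "expected 64-bit double");

/* Reinterpret the bits of a double as an integer (no value conversion). */
uint64_t bits_of(double x)
{
  uint64_t u;
  memcpy(&u, &x, sizeof u);
  return u;
}

/* Select performed on the floating-point values (what fcsel would do). */
double select_fp(int cond, double x, double y)
{
  return cond ? x : y;
}

/* Same select performed on the raw integer bits (what csel would do
   if the operands already live in general-purpose registers). */
double select_via_int_bits(int cond, double x, double y)
{
  uint64_t r = cond ? bits_of(x) : bits_of(y);
  double out;
  memcpy(&out, &r, sizeof out);
  return out;
}
```

Both functions return bit-identical results for any inputs, including signed zeros, which is why the choice between the two instructions can be left to where the operands already are.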
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski changed:

| What | Removed | Added |
|---|---|---|
| Severity | normal | enhancement |
| CC | | pinskia at gcc dot gnu.org |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Or a =r,r,r alternative to the FCSEL pattern instead...