[Bug tree-optimization/79938] gcc unnecessarily spills xmm register to stack when inserting vector items

2021-08-05 Thread postmaster at raasu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79938 --- Comment #6 from postmaster at raasu dot org --- I tried identical code using intrinsics with both clang and gcc: clang: movdqa xmm1,XMMWORD PTR [rip+0xd98]# 402050 <_IO_stdin_used+0x50> pand xmm1,xmm0 movdqa xmm2,xmm0 pshufb

[Bug tree-optimization/79938] gcc unnecessarily spills xmm register to stack when inserting vector items

2021-08-03 Thread postmaster at raasu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79938 --- Comment #5 from postmaster at raasu dot org --- My brains think it's basically four shuffles and three vector additions. It's part of vectorized adler32 implementation, so there is real-life use for the optimization.

[Bug tree-optimization/79938] gcc unnecessarily spills xmm register to stack when inserting vector items

2021-08-02 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79938 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Last reconfirmed|