[Bug target/103850] missed optimization in AVX code

2022-01-04 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850 --- Comment #6 from Martin Reinecke --- I would have expected that this does not make a significant difference, assuming that speculative execution works and the branch predictor takes the jump backwards at the loop's end. In that picture both v

[Bug target/103850] missed optimization in AVX code

2021-12-28 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850 --- Comment #3 from Martin Reinecke --- Just for completeness, this is the CPU I'm running on: vendor_id : AuthenticAMD cpu family : 23 model : 96 model name : AMD Ryzen 7 4800H with Radeon Graphics stepping: 1

[Bug target/103850] missed optimization in AVX code

2021-12-28 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850 --- Comment #2 from Martin Reinecke --- Thanks! This flag indeed causes both kernels to have the same speed, but (at least for me) it's slower than both original versions... slow kernel version: 29.027915 GFlops/s fast kernel version: 29.008313

[Bug c/103850] New: missed optimization in AVX code

2021-12-28 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850 Bug ID: 103850 Summary: missed optimization in AVX code Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assi

[Bug libstdc++/103805] Inconsistent exception specifications

2021-12-22 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805 --- Comment #6 from Martin Reinecke --- Ouch. That reminds me when Redhat(?) did the same many years ago and caused no end of confusion. Anyway, sorry for the noise!

[Bug libstdc++/103805] Inconsistent exception specifications

2021-12-22 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805 --- Comment #4 from Martin Reinecke --- Sorry if I specified the wrong version. My local (Debian unstable) g++ reports martin@marvin:~/codes/ducc$ g++ -v Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11

[Bug libstdc++/103805] New: Inconsistent exception specifications

2021-12-22 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805 Bug ID: 103805 Summary: Inconsistent exception specifications Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc

[Bug tree-optimization/99728] code pessimization when using wrapper classes around SIMD types

2021-07-02 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 --- Comment #12 from Martin Reinecke --- Any hope of addressing this for gcc 12? I have a real-world test case where this effect causes roughly 15-20% slowdown, and I expect that with the wider availability of std::simd types more people will enc

[Bug c++/99728] code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 --- Comment #7 from Martin Reinecke --- Thanks! (BTW, I'm aware of your code and will immediately switch to it once it lands in gcc! But for the time being I try to make do with my poor man's version to avoid the external dependency.)

[Bug c++/99728] code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 --- Comment #5 from Martin Reinecke --- (In reply to Matthias Kretz (Vir) from comment #4) > FWIW, using std::experimental::native_simd also does not hoist the > stores out of the loop. However, if you pass d by value and return d, the > issue go

[Bug c++/99728] code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 --- Comment #2 from Martin Reinecke --- Created attachment 50458 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50458&action=edit additional test case by Alexander Monakov

[Bug c++/99728] code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 --- Comment #1 from Martin Reinecke --- Created attachment 50457 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50457&action=edit generated assembler

[Bug c++/99728] New: code pessimization when using wrapper classes around SIMD types

2021-03-23 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728 Bug ID: 99728 Summary: code pessimization when using wrapper classes around SIMD types Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/98544] [11 regression] Wrong code generated by tree vectorizer since r11-3917-g28290cb50c7dbf87

2021-01-08 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544 --- Comment #22 from Martin Reinecke --- Brilliant, thank you very much for tracking this one down! My FFT library now works correctly again with all optimizations enabled, which is a great relief. The scipy maintainers will be happy that they wo

[Bug tree-optimization/98544] [11 regression] Wrong code generated by tree vectorizer since r11-3917-g28290cb50c7dbf87

2021-01-07 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544 --- Comment #15 from Martin Reinecke --- "Problem at length N" means that the FFT of length N is computed incorrectly. Also, N==l1*ido*x. For an FFT of length N, the computation is broken down into several passes. Let's take N=15. First the prom

[Bug tree-optimization/98544] [11 regression] Wrong code generated by tree vectorizer since r11-3917-g28290cb50c7dbf87

2021-01-07 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544 --- Comment #13 from Martin Reinecke --- > What kind of shape (w/o too much guessing) is the function expecting for its > input arrays? For radb the size of the cc and ch arrays is l1*ido*x. Size of wa is (x-1)*ido.
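
The array shapes given in this comment can be written down as a small helper. This is only a sketch: `RadbSizes` and `radb_sizes` are hypothetical names that do not appear in the bug report, and treating `x` as the radix of the pass (3 for "radb3", 4 for "radb4") is an assumption based on the function names mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the library discussed in the bug report.
struct RadbSizes {
  std::size_t cc_ch;  // number of elements in each of cc and ch
  std::size_t wa;     // number of elements in wa
};

// Sizes as stated in the comment: cc and ch hold l1*ido*x elements,
// wa holds (x-1)*ido elements.
inline RadbSizes radb_sizes(std::size_t l1, std::size_t ido, std::size_t x) {
  // Assumption: x is the radix of the pass, so it is at least 1
  // and x-1 cannot wrap around.
  assert(x >= 1);
  return RadbSizes{l1 * ido * x, (x - 1) * ido};
}
```

For example, with l1=1, ido=5 and x=3 (so that l1*ido*x is 15), cc and ch would each hold 15 elements and wa would hold 10.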

[Bug tree-optimization/98544] [11 regression] Wrong code generated by tree vectorizer

2021-01-05 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544 --- Comment #1 from Martin Reinecke --- Problem seems to be related to the use of __restrict__. If I remove the DUCC0_RESTRICT from the function definitions of "radb3", "radb4" etc., the problem goes away. However I don't see where I'm violatin
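
For readers unfamiliar with `__restrict__`, the following sketch shows the promise the qualifier makes to the compiler. It is not the code from the report: `SKETCH_RESTRICT` and `scale_into` are hypothetical names, and the real definition of `DUCC0_RESTRICT` is not shown in the thread. `__restrict__` itself is a GCC extension (also accepted by Clang).

```cpp
#include <cstddef>

// Hypothetical stand-in for a macro such as DUCC0_RESTRICT;
// the actual definition is not shown in the bug report.
#define SKETCH_RESTRICT __restrict__

// By marking both pointers restrict-qualified, the caller promises that
// the memory written through `out` is not accessed through `in` during
// the call. The vectorizer may then reorder or batch loads and stores.
inline void scale_into(const double *SKETCH_RESTRICT in,
                       double *SKETCH_RESTRICT out,
                       std::size_t n, double factor) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = factor * in[i];
}
```

Calling `scale_into` with two distinct, non-overlapping arrays honours the promise. Passing overlapping ranges would violate it, and only in that case would miscompiled-looking results be the caller's fault rather than the compiler's.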

[Bug c++/98544] New: [11 regression] Wrong code generated by tree vectorizer

2021-01-05 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544 Bug ID: 98544 Summary: [11 regression] Wrong code generated by tree vectorizer Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priorit

[Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853

2021-01-05 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516 --- Comment #9 from Martin Reinecke --- Thanks, this fixes the reduced test case for me as well! Unfortunately there seems to be more where this one came from, since my comprehensive test suite still fails ... I'll try to produce test cases and

[Bug tree-optimization/98516] Wrong code generated by tree vectorizer

2021-01-04 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516 --- Comment #1 from Martin Reinecke --- Minimal set of flags to trigger the problem seems to be g++ -std=c++17 -O1 -ftree-vectorize -fno-signed-zeros bug.cc

[Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer

2021-01-04 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516 Bug ID: 98516 Summary: Wrong code generated by tree vectorizer Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optim

[Bug c++/97564] New: [11.0 regression] pybind11 compilation failure

2020-10-24 Thread martin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97564 Bug ID: 97564 Summary: [11.0 regression] pybind11 compilation failure Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++