[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

--- Comment #6 from ikonomisma at googlemail dot com ---
(In reply to Richard Biener from comment #5)
> OK, so I can compile the testcase now but I fail to see the error. We're
> doing pointer difference compares and those should work out fine?
>
> We're also doing many checks but you probably refer to the very first test?
>
> _Z12brokenvectorRKSt6vectorIiSaIiEES3_:
> .LFB2470:
>         .cfi_startproc
>         movq    (%rsi), %rdx
>         movq    8(%rdi), %rsi
>         xorl    %r8d, %r8d
>         movq    (%rdi), %rax
>         movq    %rsi, %rcx
>         subq    %rax, %rcx
>         cmpq    $12, %rcx
>         jle     .L18
>
> that's created from
>
> [local count: 1073741824]:
> _7 = MEM[(int * *)b_2(D)];
> _6 = MEM[(int * *)a_3(D) + 8B];
> _4 = MEM[(int * *)a_3(D)];
> _10 = _6 - _4;
> if (_10 > 12)
>
> what are the actual pointers here?

So the structure of the code is like this:

- function label
- function prologue
- test whether 12 bytes or fewer (3 or fewer ints) are to be processed; if so, jump to the SIMD vector prologue
- unrolled scalar loop
- test whether 12 bytes or fewer remain to be processed
- jump back to the scalar loop if more of the vector remains to be processed
- SIMD vector prologue, testing whether enough of the vector remains unprocessed to warrant vectorized execution. This will effectively never be the case.

To see the problem, you could call the function (non-inlined) in a test program with a reasonably large vector. Run it under gdb, set a breakpoint on one of the instructions in the SIMD vector code, and run. You'll find that the SIMD code never gets executed.
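A minimal sketch of such a test program follows. The attached testcase is not reproduced in this thread, so the body of brokenvector below is only an assumed stand-in (a dot product via std::transform_reduce); the parameter types are taken from the mangled symbol, while the int return type and the run_large helper are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Assumed stand-in for the attached testcase: the parameter types come from
// the mangled name _Z12brokenvectorRKSt6vectorIiSaIiEES3_, but the body and
// the int return type are guesses made for illustration only.
// noinline (a GCC attribute) keeps the function as a separate symbol so a
// breakpoint can be placed inside it.
__attribute__((noinline))
int brokenvector(const std::vector<int>& a, const std::vector<int>& b)
{
    // Sum of element-wise products, using the overload that takes no
    // execution policy.
    return std::transform_reduce(a.begin(), a.end(), b.begin(), 0,
                                 std::plus<>{}, std::multiplies<>{});
}

// Hypothetical driver: builds two equally sized vectors of n elements and
// calls the function once.
int run_large(std::size_t n)
{
    std::vector<int> a(n, 1), b(n, 2);
    assert(a.size() == b.size());
    return brokenvector(a, b);
}
```

Calling run_large from main with a size well above 3 elements, building with the same optimization flags as the testcase, and setting a gdb breakpoint on an instruction inside the vectorized part of brokenvector should show whether that code is ever reached.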
[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

--- Comment #5 from Richard Biener ---
OK, so I can compile the testcase now but I fail to see the error. We're doing pointer difference compares and those should work out fine?

We're also doing many checks but you probably refer to the very first test?

_Z12brokenvectorRKSt6vectorIiSaIiEES3_:
.LFB2470:
        .cfi_startproc
        movq    (%rsi), %rdx
        movq    8(%rdi), %rsi
        xorl    %r8d, %r8d
        movq    (%rdi), %rax
        movq    %rsi, %rcx
        subq    %rax, %rcx
        cmpq    $12, %rcx
        jle     .L18

that's created from

  [local count: 1073741824]:
  _7 = MEM[(int * *)b_2(D)];
  _6 = MEM[(int * *)a_3(D) + 8B];
  _4 = MEM[(int * *)a_3(D)];
  _10 = _6 - _4;
  if (_10 > 12)

what are the actual pointers here?
[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

--- Comment #4 from ikonomisma at googlemail dot com ---
I can reproduce this on both x86_64 and AArch64 Ubuntu 19.10. According to an answer to my question on Stack Overflow (https://stackoverflow.com/a/59995702/3185968), using std::transform without an execution policy requires a recent libstdc++ (https://github.com/gcc-mirror/gcc/commit/b93041f0d3c9a2fc64f0f5fb538e78d5e2001d32).

The issue is reproducible on godbolt.org: https://godbolt.org/z/fZdAqp
[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

--- Comment #3 from ikonomisma at googlemail dot com ---
(In reply to Richard Biener from comment #2)
> Hmm, I get
>
> /home/space/rguenther/install/gcc-9.2/include/c++/9.2.0/pstl/execution_defs.h:155:7:
> error: no type named ‘type’ in ‘struct std::enable_if’
>   155 |       using __enable_if_execution_policy =
>       |       ^~~~
>
> trying to compile this with the FSF 9.2.0 release.

Well, as far as I understand, there should be an overload without an execution policy; both https://en.cppreference.com/w/cpp/algorithm/transform_reduce and the n4659 C++ draft standard include it.
[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |rguenth at gcc dot gnu.org
             Blocks|                            |53947

--- Comment #2 from Richard Biener ---
Hmm, I get

/home/space/rguenther/install/gcc-9.2/include/c++/9.2.0/pstl/execution_defs.h:155:7:
error: no type named ‘type’ in ‘struct std::enable_if’
  155 |       using __enable_if_execution_policy =
      |       ^~~~

trying to compile this with the FSF 9.2.0 release.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug tree-optimization/93440] scalar unrolled loop makes vectorized code unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93440

--- Comment #1 from ikonomisma at googlemail dot com ---
Created attachment 47712
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47712&action=edit
Minimal example C++ source code to trigger unreachable SIMD vector code