https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97059

--- Comment #1 from Dimitri Gorokhovik <dimitri.gorokhovik at free dot fr> ---
I was able to reduce the same code (attached file bug-6.cpp).

-- when compiled correctly, running it produces the following (expected)
output:

cube: ({ 0, 0, 0 }, { 1, 1, 1 }) 
cube: ({ 0, 0, 1 }, { 1, 1, 2 }) 
cube: ({ 0, 0, 2 }, { 1, 1, 3 }) 
cube: ({ 0, 1, 0 }, { 1, 2, 1 }) 
cube: ({ 0, 1, 1 }, { 1, 2, 2 }) 
cube: ({ 0, 1, 2 }, { 1, 2, 3 }) 
cube: ({ 0, 2, 0 }, { 1, 3, 1 }) 
cube: ({ 0, 2, 1 }, { 1, 3, 2 }) 
cube: ({ 0, 2, 2 }, { 1, 3, 3 }) 
cube: ({ 1, 0, 0 }, { 2, 1, 1 }) 
cube: ({ 1, 0, 1 }, { 2, 1, 2 }) 
cube: ({ 1, 0, 2 }, { 2, 1, 3 }) 
cube: ({ 1, 1, 0 }, { 2, 2, 1 }) 
cube: ({ 1, 1, 1 }, { 2, 2, 2 }) 
cube: ({ 1, 1, 2 }, { 2, 2, 3 }) 
cube: ({ 1, 2, 0 }, { 2, 3, 1 }) 
cube: ({ 1, 2, 1 }, { 2, 3, 2 }) 
cube: ({ 1, 2, 2 }, { 2, 3, 3 }) 
cube: ({ 2, 0, 0 }, { 3, 1, 1 }) 
cube: ({ 2, 0, 1 }, { 3, 1, 2 }) 
cube: ({ 2, 0, 2 }, { 3, 1, 3 }) 
cube: ({ 2, 1, 0 }, { 3, 2, 1 }) 
cube: ({ 2, 1, 1 }, { 3, 2, 2 }) 
cube: ({ 2, 1, 2 }, { 3, 2, 3 }) 
cube: ({ 2, 2, 0 }, { 3, 3, 1 }) 
cube: ({ 2, 2, 1 }, { 3, 3, 2 }) 
cube: ({ 2, 2, 2 }, { 3, 3, 3 }) 
count = 27

-- when compiled incorrectly, it prints out:

count = 0

Tested with build g++ (GCC) 11.0.0 20200924 (experimental).


In order to compile and run:

g++ -std=c++17 -O3 -o bug-6 bug-6.cpp && ./bug-6

This builds for implicit '-m64' (x86_64) and produces invalid output. 

To get valid output, compile with any one of the following:
-m32
-O0 (instead of -O3)
-fno-tree-sra
one of -DFIX_0 to -DFIX_4 


From my limited understanding of tree dumps, here is what roughly happens:

-- the routine 'begin()', line 183, returns 'struct iterator' by value. The
latter has a size of 14 bytes, so it is returned "in registers". Forcing it to
be returned via memory ==> issue goes away. (Methods to force: make it bigger
than 16 bytes, make it volatile, use -m32.) Note also that, when the routine is
evaluated as constexpr (in static_assert), the issue is not reproduced.

-- all called routines are inlined into a single routine, 'count_them'.
Preventing the inlining of the routine 'can_be_incremented ()' ==> issue goes
away. (Method to prevent: define FIX_1.)

-- SRA replaces several fields of the 'struct iterator' (line 150), notably,
'idx_' (line 153). Disable SRA ==> issue goes away (-fno-tree-sra or use -O0). 

This replacement by tree-SRA somehow doesn't propagate the writes to idx_ from
the replacement vars back to the original part of the structure, which lives
"in the return registers". When the return value lives in memory, the writes
are propagated correctly.
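
The register-versus-memory distinction above can be made concrete with a small
sketch. It assumes the x86-64 System V calling convention, under which a
trivially copyable aggregate of integer fields is returned in registers when it
fits in 16 bytes and via memory otherwise. The types 'iter14' and
'iter_padded', their field layouts, and the helper
'likely_returned_in_registers' are hypothetical and are not taken from
bug-6.cpp; the helper only approximates the ABI's classification rule.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical 14-byte stand-in for 'struct iterator'; the real layout in
// bug-6.cpp is not shown in this comment.
struct iter14 {
    std::uint16_t idx_[3];
    std::uint16_t dim_[3];
    std::uint16_t done_;
};

// Same fields plus padding that pushes the size past 16 bytes, mirroring the
// "make it bigger than 16 bytes" workaround mentioned in the report.
struct iter_padded {
    std::uint16_t idx_[3];
    std::uint16_t dim_[3];
    std::uint16_t done_;
    unsigned char pad_[8];
};

// Rough approximation (an assumption, not the full psABI algorithm): an
// integer-only, trivially copyable aggregate of at most 16 bytes is a
// candidate for being returned in registers on x86-64 System V.
template <class T>
constexpr bool likely_returned_in_registers =
    sizeof(T) <= 16 && std::is_trivially_copyable_v<T>;

static_assert(sizeof(iter14) == 14, "expected on mainstream ABIs");
static_assert(likely_returned_in_registers<iter14>, "fits in 16 bytes");
static_assert(!likely_returned_in_registers<iter_padded>, "exceeds 16 bytes");
```

The checks say nothing about the SRA issue itself; they only show why a
14-byte struct and a padded variant can take different return paths.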

The compiler then eliminates the loop in 'can_be_incremented' and evaluates the
call to that routine as 'false' (line 163). Forcibly keeping the loop (-DFIX_2)
or replacing it with non-loop code (-DFIX_0) ==> issue goes away.
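
For readers without the attachment, the overall shape described above can be
sketched roughly as follows. This is a hypothetical reconstruction and not the
attached bug-6.cpp: the field layout, the 'dim_' and 'done_' members, the
'increment' routine and the 'sketch' namespace are all assumptions, and the
code is not claimed to trigger the miscompilation. It only shows a small
iterator returned by value from 'begin()', a loop inside 'can_be_incremented',
and 'count_them' driving the iteration.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical iterator over an n x n x n grid of unit cubes (14 bytes with
// these assumed fields).
struct iterator {
    std::uint16_t idx_[3];  // current cube coordinates
    std::uint16_t dim_[3];  // exclusive upper bound per axis
    std::uint16_t done_;    // non-zero once iteration has finished

    // Written as a loop, like the routine the report says gets folded to
    // 'false' after SRA.
    constexpr bool can_be_incremented() const {
        for (int i = 0; i < 3; ++i)
            if (idx_[i] + 1 < dim_[i]) return true;
        return false;
    }

    // Odometer-style step: the last axis varies fastest, matching the order
    // of the expected output.
    constexpr void increment() {
        if (!can_be_incremented()) { done_ = 1; return; }
        for (int i = 2; i >= 0; --i) {
            if (idx_[i] + 1 < dim_[i]) { ++idx_[i]; return; }
            idx_[i] = 0;
        }
    }
};

// Returns the iterator by value, as 'begin()' does in the report.
constexpr iterator begin(std::uint16_t n) {
    return iterator{{0, 0, 0}, {n, n, n}, static_cast<std::uint16_t>(n == 0)};
}

// Counts the cubes visited; all routines above are inlining candidates here.
constexpr int count_them(std::uint16_t n) {
    int count = 0;
    for (iterator it = begin(n); !it.done_; it.increment()) ++count;
    return count;
}

// Compile-time evaluation, the mode in which the report says the issue is
// not reproduced.
static_assert(count_them(3) == 27, "3 x 3 x 3 grid has 27 cubes");

}  // namespace sketch
```

A correct build should also return 27 from sketch::count_them(3) at run time,
matching the expected "count = 27".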
