------- Comment #28 from amonakov at gcc dot gnu dot org 2010-03-16 15:26 -------
To provide an update on the situation on 4.5 trunk: AFAIK things have generally improved with Zdenek's second commit (in comment #23) and the auto-inc-dec improvements in 4.5. However, on the particular testcase discussed here (comment #20), we still don't DTRT on ia64 (powerpc is OK, don't know about arm). There is an unfortunate interplay between IVOPTS, RTL PRE, RTL loop analysis and the auto-inc-dec pass.
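For readers without comment #20 at hand, the loop under discussion has roughly the shape sketched here. This is reconstructed from the GIMPLE dumps that follow, not the literal testcase: the function names fill_indexed and fill_ivopts, the array size, and the parameter types are assumptions. The second function spells out in C what ivopts turns the loop into, assuming sizeof(short) == 2 (as the stride of 2 in the dumps suggests).

```c
#include <stdint.h>

short a[1000];  /* array size is an assumption */

/* Source-level shape of the loop (hypothetical reconstruction). */
void fill_indexed(int v, int len)
{
    int i;
    for (i = 0; i < len; i++)
        a[i] = (short) v;
}

/* Rough C rendering of the loop after ivopts: the index is replaced by an
   address-valued induction variable, and the exit test compares it against
   an end address computed in the preheader. */
void fill_ivopts(int v, int len)
{
    short tmp = (short) v;
    uintptr_t iv, end;

    if (len <= 0)               /* the loop is only entered for len > 0 */
        return;

    iv  = (uintptr_t) &a[0];
    /* ((unsigned) len - 1) * 2 + (&a + 2), as in the preheader dump */
    end = (uintptr_t) ((unsigned int) len + 4294967295u) * 2
          + ((uintptr_t) &a + 2);

    do {
        *(short *) iv = tmp;    /* MEM[base: iv] = tmp */
        iv += 2;
    } while (iv != end);
}
```

Both functions store the same values; the difference is only that the second one carries the extra preheader arithmetic that later passes have to see through.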
First, ivopts produces a harder-to-grok sequence in the loop preheader, transforming

  <bb 3>:
    pretmp.2_8 = (short int) v_4(D);

  <bb 4>:
    # i_13 = PHI <i_6(5), 0(3)>
    a[i_13] = pretmp.2_8;
    i_6 = i_13 + 1;
    if (len_3(D) > i_6)
      goto <bb 5>;
    else
      goto <bb 6>;

  <bb 5>:
    goto <bb 4>;

to

  <bb 3>:
    pretmp.2_8 = (short int) v_4(D);
    ivtmp.7_11 = (long unsigned int) &a[0];
    D.2006_17 = (unsigned int) len_3(D);
    D.2007_18 = D.2006_17 + 4294967295;
    D.2008_19 = (long unsigned int) D.2007_18;
    D.2009_20 = D.2008_19 * 2;
    a.13_21 = (long unsigned int) &a;
    D.2011_22 = a.13_21 + 2;
    D.2012_23 = D.2009_20 + D.2011_22;

  <bb 4>:
    # ivtmp.7_1 = PHI <ivtmp.7_12(5), ivtmp.7_11(3)>
    D.2005_16 = (void *) ivtmp.7_1;
    MEM[base: D.2005_16]{a[i]} = pretmp.2_8;
    ivtmp.7_12 = ivtmp.7_1 + 2;
    if (ivtmp.7_12 != D.2012_23)
      goto <bb 5>;
    else
      goto <bb 6>;

  <bb 5>:
    goto <bb 4>;

The preheader is not cleaned up until RTL PRE. Then, PRE transforms

  L54:
     51 NOTE_INSN_BASIC_BLOCK
     52 [r371:DI]=r373:SI#0
     53 r371:DI=r371:DI+0x2
     55 r392:BI=r371:DI!=r381:DI
     56 pc={(r392:BI!=0x0)?L54:pc}

to

  L83:
     82 NOTE_INSN_BASIC_BLOCK
     77 r397:DI=r371:DI+0x2
  L54:
     51 NOTE_INSN_BASIC_BLOCK
     52 [r371:DI]=r373:SI#0
     75 r371:DI=r397:DI
        REG_EQUAL: r371:DI+0x2
     55 r392:BI=r371:DI!=r381:DI
     56 pc={(r392:BI!=0x0)?L83:pc}
        REG_DEAD: r392:BI
        REG_BR_PROB: 0x26ac

... which is something the auto-inc-dec pass is not able to handle. If I disable RTL PRE with -fdbg-cnt=pre:0, auto-inc is generated, but the doloop pass is confused instead:

  Loop 1 is simple:
    simple exit 4 -> 5
    infinite if: (expr_list:REG_DEP_TRUE (subreg:SI (and:DI (plus:DI (minus:DI (ashift:DI (reg:DI 390)
                          (const_int 1 [0x1]))
                      (reg:DI 371 [ ivtmp.7 ]))
                  (const:DI (plus:DI (symbol_ref:DI ("a") [flags 0x2] <var_decl 0x7ffff7968000 a>)
                          (const_int 2 [0x2]))))
              (const_int 1 [0x1])) 0)
      (nil))
    number of iterations: (lshiftrt:DI (plus:DI (minus:DI (reg:DI 381 [ D.2012 ])
              (reg:DI 371 [ ivtmp.7 ]))
          (const_int -2 [0xfffffffffffffffe]))
      (const_int 1 [0x1]))
    upper bound: -2
  Doloop: Possible infinite iteration case.
  Doloop: The loop is not suitable.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32283
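
To make the two expressions in the doloop dump a bit more concrete, here is a small sketch of how I read them; the helper names latch_count and never_exits are hypothetical and not anything in GCC. For a loop that advances an address by 2 and exits on `!=`, the "number of iterations" expression is (end - start - 2) >> 1, and the "infinite if" condition appears to test the low bit of the distance between start and end: if that distance is odd, stepping by 2 can never hit the end address.

```c
#include <stdint.h>

/* Latch executions for "p += 2; while (p != end)", mirroring the dump's
   (lshiftrt (plus (minus end start) -2) 1).  Hypothetical helper. */
uint64_t latch_count(uint64_t start, uint64_t end)
{
    return (end - start - 2) >> 1;
}

/* Nonzero when end - start is odd, in which case the != exit test is never
   satisfied by steps of 2.  This mirrors the (and ... 1) in "infinite if",
   under my reading of the dump. */
int never_exits(uint64_t start, uint64_t end)
{
    return (int) ((end - start) & 1);
}
```

With end = start + 2 * len, as the preheader computes for len > 0, latch_count gives len - 1 and never_exits gives 0, so the loop is finite in practice; the analysis evidently just cannot prove that from the RTL it sees.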