http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46854

Michael Meissner <meissner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2010.12.09 17:56:39
                 CC|                            |meissner at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #2 from Michael Meissner <meissner at gcc dot gnu.org> 2010-12-09 17:56:39 UTC ---
Note that -O2 generates mostly the code you want, except that it loads the
address of the string twice.

Here is the code generated with a 4.4.4 based compiler (the compiler happens to
be the IBM advance toolchain, version 3.0-1) using -O2 -m32 (-O1/-O3 generate
the same code):

test:
        mr. 0,3
        mtctr 0
        beq 0,.L10
        lis 3,.lanch...@ha
        la 3,.lanch...@l(3)
        .p2align 4,,15
.L8:
        lbzu 0,1(3)
        cmpwi 7,0,0
        bne 7,.L8
        bdnz .L8
        blr
.L10:
        lis 3,.lanch...@ha
        la 3,.lanch...@l(3)
        blr

The SLES 11SP1 system compiler, which is based on GCC 4.3.4, generates the same
code.

However, GCC 4.6 trunk seems to have regressed slightly with -O2 or -O3: it
does not track that the lbzu already updates the pointer, and instead maintains
a separate copy of the pointer that it increments with its own addi:

        mr. 0,3
        mtctr 0
        beq- 0,.L5
        lis 3,.lanch...@ha
        la 3,.lanch...@l(3)
.L4:
        mr 9,3
.L3:
        lbzu 0,1(9)
        addi 3,3,1
        cmpwi 7,0,0
        bne+ 7,.L3
        bdnz .L4
        blr
.L5:
        lis 3,.lanch...@ha
        la 3,.lanch...@l(3)
        blr
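
As a rough C-level picture of the difference between the two -O2 listings
above, the sketch below mirrors their control flow. It is reconstructed from
the assembly only and is not the test case from the bug report; the function
names, the explicit string parameter `s` (the listings use a fixed anchor
address instead), and the sample buffer in the usage note are all hypothetical.
The first function uses one pointer that the load itself advances, as lbzu
does; the second keeps a scan pointer plus a separately incremented result
pointer, as the 4.6 trunk code does.

```c
/* Shape of the 4.4.4 code: a single pointer, advanced by the load itself
   (what lbzu does). Returns the address of the n-th zero byte found
   strictly after s, or s itself when n == 0. */
const char *skip_one_pointer(const char *s, unsigned n)
{
    const char *p = s;
    if (n == 0)
        return p;
    do {
        while (*++p != 0)      /* lbzu 0,1(3); cmpwi; bne */
            ;
    } while (--n != 0);        /* bdnz */
    return p;
}

/* Shape of the 4.6 trunk code: q is the pointer the load updates, while
   p is a second copy kept in step by a separate increment (the addi). */
const char *skip_two_pointers(const char *s, unsigned n)
{
    const char *p = s;
    if (n == 0)
        return p;
    do {
        const char *q = p;     /* mr 9,3 */
        char c;
        do {
            c = *++q;          /* lbzu 0,1(9) */
            p++;               /* addi 3,3,1 */
        } while (c != 0);
    } while (--n != 0);        /* bdnz */
    return p;
}
```

Both functions return the same pointer for any input, for example for the
buffer "ab\0cd\0ef" with n from 0 to 3; the second simply does one extra
increment per byte, which is the redundancy visible in the trunk listing.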

Trunk with -Os does generate the two comparisons:

        mr 9,3
        lis 3,.lanch...@ha
        la 3,.lanch...@l(3)
        b .L2
.L5:
        mr 11,3
        addi 3,3,1
        lbz 0,1(11)
        cmpwi 7,0,0
        bne+ 7,.L5
        addi 9,9,-1
.L2:
        cmpwi 7,9,0
        bne+ 7,.L5
        blr

So there are two bugs here: one is that -Os generates larger code than -O2,
and the other is the code regression in GCC 4.6.
