http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57052

             Bug #: 57052
           Summary: missed optimization with rotate and mask
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: amo...@gmail.com

/* -m32 -O -S */
int
foo (unsigned int x, int r)
{
  return ((x << r) | (x >> (32 - r))) & 0xff;
}

results in:

foo:
    rlwnm 3,3,4,0xffffffff
    rlwinm 3,3,0,24,31
    blr

Compiling the same code with -m32 -O -S -mlittle gives the properly optimized
result of:

foo:
    rlwnm 3,3,4,0xff
    blr

This is because many of the rs6000.md rotate/shift and mask patterns use
subregs with wrong byte offsets.  E.g. rotlsi3_internal7, the insn that ought
to match here, has (subreg:QI (rotate:SI ...) 0).  The 0 selects the most
significant byte when BYTES_BIG_ENDIAN and the least significant when
!BYTES_BIG_ENDIAN.

Fortunately combine doesn't seem to generate subregs for high parts, so
changing the testcase mask to 0xff000000 doesn't result in wrong code.

Annoyingly, rotlsi3_internal4 would match here too if combine_simplify_rtx()
didn't simplify (set (reg:SI) (and:SI () 255)) to use subregs.
