On Thursday, January 21, 2016 09:21:52 AM Richard Biener wrote:
> On Thu, 21 Jan 2016, Thomas Preud'homme wrote:
> > On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote:
> > > On Tue, 5 Jan 2016, Thomas Preud'homme wrote:
> > > > Hi,
> > > > 
> > > > bswap optimization pass generate wrong code on big endian targets when
> > > > the
> > > > result of a bit operation it analyzed is a partial load of the range
> > > > of
> > > > memory accessed by the original expression (when one or more bytes at
> > > > lowest address were lost in the computation). This is due to the way
> > > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being
> > > > compared to the result of the symbolic expression. Part of the
> > > > adjustment
> > > > is endian independent: it's to ignore the bytes that were not accessed
> > > > by
> > > > the original gimple expression. However, when the result has less byte
> > > > than that original expression, some more byte need to be ignored and
> > > > this
> > > > is endian dependent.
> > > > 
> > > > The current code only support loss of bytes at the highest addresses
> > > > because there is no code to adjust the address of the load. However,
> > > > for
> > > > little and big endian targets the bytes at highest address translate
> > > > into
> > > > different byte significance in the result. This patch first separate
> > > > cmpxchg and cmpnop adjustement into 2 steps and then deal with
> > > > endianness
> > > > correctly for the second step.
> > > > 
> > > > ChangeLog entries are as follow:
> > > > 
> > > > 
> > > > *** gcc/ChangeLog ***
> > > > 
> > > > 2015-12-16  Thomas Preud'homme  <thomas.preudho...@arm.com>
> > > > 
> > > >         PR tree-optimization/67781
> > > >         * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in
> > > >         cmpxchg
> > > >         and cmpnop in two steps: first the ones not accessed in
> > > >         original
> > > >         gimple expression in a endian independent way and then the
> > > >         ones
> > > >         not
> > > >         accessed in the final result in an endian-specific way.
> > > > 
> > > > *** gcc/testsuite/ChangeLog ***
> > > > 
> > > > 2015-12-16  Thomas Preud'homme  <thomas.preudho...@arm.com>
> > > > 
> > > >         PR tree-optimization/67781
> > > >         * gcc.c-torture/execute/pr67781.c: New file.
> > > > 
> > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c
> > > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
> > > > new file mode 100644
> > > > index 0000000..bf50aa2
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
> > > > @@ -0,0 +1,34 @@
> > > > +#ifdef __UINT32_TYPE__
> > > > +typedef __UINT32_TYPE__ uint32_t;
> > > > +#else
> > > > +typedef unsigned uint32_t;
> > > > +#endif
> > > > +
> > > > +#ifdef __UINT8_TYPE__
> > > > +typedef __UINT8_TYPE__ uint8_t;
> > > > +#else
> > > > +typedef unsigned char uint8_t;
> > > > +#endif
> > > > +
> > > > +struct
> > > > +{
> > > > +  uint32_t a;
> > > > +  uint8_t b;
> > > > +} s = { 0x123456, 0x78 };
> > > > +
> > > > +int pr67781()
> > > > +{
> > > > +  uint32_t c = (s.a << 8) | s.b;
> > > > +  return c;
> > > > +}
> > > > +
> > > > +int
> > > > +main ()
> > > > +{
> > > > +  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
> > > > +    return 0;
> > > > +
> > > > +  if (pr67781 () != 0x12345678)
> > > > +    __builtin_abort ();
> > > > +  return 0;
> > > > +}
> > > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> > > > index b00f046..e5a185f 100644
> > > > --- a/gcc/tree-ssa-math-opts.c
> > > > +++ b/gcc/tree-ssa-math-opts.c
> > > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct
> > > > symbolic_number *n, int limit)
> > > > 
> > > >  static gimple *
> > > >  find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool
> > > >  *bswap)
> > > >  {
> > > > 
> > > > +  unsigned rsize;
> > > > +  uint64_t tmpn, mask;
> > > > 
> > > >  /* The number which the find_bswap_or_nop_1 result should match in
> > > >  order
> > > >  
> > > >     to have a full byte swap.  The number is shifted to the right
> > > >     according to the size of the symbolic number before using it.  */
> > > > 
> > > > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct
> > > > symbolic_number *n, bool *bswap)
> > > > 
> > > >    /* Find real size of result (highest non-zero byte).  */
> > > >    if (n->base_addr)
> > > > 
> > > > -    {
> > > > -      int rsize;
> > > > -      uint64_t tmpn;
> > > > -
> > > > -      for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER,
> > > > rsize++); -      n->range = rsize;
> > > > -    }
> > > > +    for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER,
> > > > rsize++);
> > > > +  else
> > > > +    rsize = n->range;
> > > > 
> > > > -  /* Zero out the extra bits of N and CMP*.  */
> > > > +  /* Zero out the bits corresponding to untouched bytes in original
> > > > gimple
> > > > +     expression.  */
> > > > 
> > > >    if (n->range < (int) sizeof (int64_t))
> > > >    
> > > >      {
> > > > 
> > > > -      uint64_t mask;
> > > > -
> > > > 
> > > >        mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1;
> > > >        cmpxchg >>= (64 / BITS_PER_MARKER - n->range) *
> > > >        BITS_PER_MARKER;
> > > >        cmpnop &= mask;
> > > >      
> > > >      }
> > > > 
> > > > +  /* Zero out the bits corresponding to unused bytes in the result of
> > > > the
> > > > +     gimple expression.  */
> > > > +  if (rsize < n->range)
> > > > +    {
> > > > +      if (BYTES_BIG_ENDIAN)
> > > > +       {
> > > > +         mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
> > > > +         cmpxchg &= mask;
> > > > +         cmpnop >>= (n->range - rsize) * BITS_PER_MARKER;
> > > > +       }
> > > > +      else
> > > > +       {
> > > > +         mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
> > > > +         cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER;
> > > > +         cmpnop &= mask;
> > > > +       }
> > > > +      n->range = rsize;
> > > > +    }
> > > > +
> > > > 
> > > >    /* A complete byte swap should make the symbolic number to start
> > > >    with
> > > >    
> > > >       the largest digit in the highest order byte. Unchanged symbolic
> > > >       number indicates a read with same endianness as target
> > > >       architecture.
> > > >       
> > > >        */
> > > > 
> > > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu
> > > > GCC
> > > > and on an arm-none-eabi GCC cross-compiler without any regression. I'm
> > > > waiting for a slot on gcc110 to do a big endian bootstrap but at least
> > > > the testcase works on mips-linux. I'll send an update once bootstrap
> > > > is
> > > > complete.
> > > > 
> > > > Is this ok for trunk and 5 branch in a week time if no regression is
> > > > reported?
> > > 
> > > Yes.
> > 
> > What about a backport to GCC 5? The patch applies cleanly and shows no
> > regression when running the testsuite with an arm-none-eabi GCC
> > cross-compiler and a natively bootstrapped x86_64-linux-gnu GCC compiler.
> > 
> > It also pass the test added in the patch with a nativly bootstrapped
> > powerpc64-unknown-linux-gnu GCC compiler (on gcc110 in compile farm).
> 
> The "Yes" above approved a backport.

Doh indeed. My apologize.

Best regards,

Thomas

Reply via email to