http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52285

             Bug #: 52285
           Summary: [4.7 Regression] libgcrypt _gcry_burn_stack slowdown
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org


#define wipememory2(_ptr,_set,_len) do { \
              volatile char *_vptr=(volatile char *)(_ptr); \
              unsigned long _vlen=(_len); \
              while(_vlen) { *_vptr=(_set); _vptr++; _vlen--; } \
                  } while(0)
#define wipememory(_ptr,_len) wipememory2(_ptr,0,_len)

__attribute__((noinline, noclone)) void
_gcry_burn_stack (int bytes)
{
  char buf[64];

  wipememory (buf, sizeof buf);
  bytes -= sizeof buf;
  if (bytes > 0)
    _gcry_burn_stack (bytes);
}

is one of the hot parts of the gcrypt camellia256 ECB benchmark, which apparently
slowed down in 4.7 compared to 4.6.  The routine is called many times, usually
with bytes equal to 372.

The above first slowed down the whole benchmark from ~ 2040ms to ~ 2400ms (-O2
-march=corei7-avx -fPIC) in
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178104
and then (somewhat surprisingly) became a tiny bit faster again with
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181172 , after which the
function is no longer tail recursion optimized (in this case surprisingly a win,
but generally IMHO a mistake).
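
For reference, a rough sketch of the loop that tail recursion optimization
would conceptually turn the function into.  This is hand-written C, not
compiler output; the name burn_stack_loop and the returned wipe counter are
assumptions added purely for illustration and are not part of libgcrypt.

```c
#include <stddef.h>

#define wipememory2(_ptr,_set,_len) do { \
    volatile char *_vptr = (volatile char *)(_ptr); \
    unsigned long _vlen = (_len); \
    while (_vlen) { *_vptr = (_set); _vptr++; _vlen--; } \
  } while (0)
#define wipememory(_ptr,_len) wipememory2(_ptr,0,_len)

/* Hypothetical loop form of _gcry_burn_stack: the self-call in tail
   position is replaced by a jump back to the top of the body, so the
   same 64-byte buffer is wiped on every iteration.  The return value
   (number of wipes performed) exists only for illustration.  */
int
burn_stack_loop (int bytes)
{
  char buf[64];
  int wipes = 0;

  do
    {
      wipememory (buf, sizeof buf);
      bytes -= (int) sizeof buf;
      wipes++;
    }
  while (bytes > 0);

  return wipes;
}
```

With the usual argument of 372, this loop runs six times (372, 308, 244, 180,
116, 52), i.e. six 64-byte volatile wipes per call, which matches the number
of recursive invocations in the original.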
