https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122041
--- Comment #6 from Petr Sumbera <sumbera at volny dot cz> ---
Interesting! By programmatically avoiding memcpy, GCC is now faster than
Studio. But even Studio now produces slightly faster code.

gdiff -u crc.c.orig crc.c
--- crc.c
+++ crc.c
@@ -41,7 +41,7 @@ crc32_update_no_xor_slice_by_8 (uint32_t crc, const char *buf)
 {
   uint64_t local_buf;

-  memcpy (&local_buf, buf, 8);
+  local_buf = *(const uint64_t *)buf;
   local_buf = le64toh (local_buf) ^ crc;
   crc = crc32_sliceby8_table[0][(local_buf >> 56) & 0xFF]
         ^ crc32_sliceby8_table[1][(local_buf >> 48) & 0xFF]

uls-0 14:25 /builds/psumbera/userland-gzip-sparc-gcc/components/gzip/TMP/test: gmake test
gcc -o test.o -c test.c
gcc -O3 -funroll-loops -mcpu=niagara4 -mtune=niagara4 -o crc-gcc.o -c crc.c
gcc -o test-gcc test.o crc-gcc.o
/opt/developerstudio12.6/bin/cc -m64 -xO4 -xtarget=generic -xarch=sparcvis -xchip=generic -xregs=no%appl -xmemalign=16s -o crc-studio.o -c crc.c
gcc -o test-studio test.o crc-studio.o
time ./test-gcc

real       10.8
user       10.7
sys         0.0
time ./test-studio

real       11.4
user       11.3
sys         0.0
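
For reference, here is a minimal sketch of the two load idioms compared in the diff above, plus one possible middle ground. The helper names (load64_memcpy, load64_cast, load64_assume_aligned) are hypothetical and do not come from crc.c. The cast variant is only valid if buf is 8-byte aligned and the access does not violate the aliasing rules; whether the caller guarantees this is an assumption here, and on strict-alignment targets such as SPARC a misaligned 64-bit load typically traps. The third variant keeps memcpy but passes an alignment promise through GCC's __builtin_assume_aligned, which may let the compiler emit a single load; that has not been measured for this bug.

```c
#include <stdint.h>
#include <string.h>

/* Portable: defined for any alignment of buf. The compiler is free to
   turn this into one load, a byte-wise sequence, or a library call. */
static uint64_t
load64_memcpy (const char *buf)
{
  uint64_t v;
  memcpy (&v, buf, sizeof v);
  return v;
}

/* Cast variant, as in the diff. ASSUMPTION: buf is 8-byte aligned and
   really points at uint64_t storage; otherwise the behavior is undefined. */
static uint64_t
load64_cast (const char *buf)
{
  return *(const uint64_t *) buf;
}

/* Hypothetical middle ground (GCC/Clang builtin): keep memcpy, but tell
   the compiler that buf is 8-byte aligned. ASSUMPTION: the promise holds. */
static uint64_t
load64_assume_aligned (const char *buf)
{
  const void *p = __builtin_assume_aligned (buf, 8);
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```

All three return the same value when buf is aligned and backed by uint64_t storage; they differ only in what they allow the compiler to assume. Comparing the output of gcc -O3 -S for each variant would show whether the alignment hint alone is enough to recover the speedup.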