https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53659
PeteVine <tulipawn at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tulipawn at gmail dot com

--- Comment #2 from PeteVine <tulipawn at gmail dot com> ---
Even though I've tested this on a Cortex-A5, the 18% difference does reproduce
on gcc 6.1.1 (2694 vs 3304 ms):

First, the slower profile for A9 codegen:

CPU: ARM Cortex-A5, speed 1.728e+06 MHz (estimated)
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 900000
samples  %        linenr info                  image name    symbol name
14460    65.7422  c-ray-mt.c:377               c-ray-mt      shade
3901     17.7358  c-ray-mt.c:336               c-ray-mt      trace
3181     14.4624  c-ray-mt.c:308               c-ray-mt      render_scanline
186      0.8456   e_pow.c:70                   libm-2.19.so  __pow_finite
68       0.3092   e_exp.c:240                  libm-2.19.so  __exp1
55       0.2501   (no location information)    no-vmlinux    /no-vmlinux
40       0.1819   c-ray-mt.c:454               c-ray-mt      get_primary_ray
38       0.1728   c-ray-mt.c:497               c-ray-mt      get_sample_pos
17       0.0773   fraiseexcpt.c:27             libm-2.19.so  feraiseexcept
13       0.0591   e_pow.c:430                  libm-2.19.so  checkint
6        0.0273   fesetround.c:31              libm-2.19.so  fesetround
5        0.0227   fputc.c:37                   libc-2.19.so  fputc
5        0.0227   feupdateenv.c:27             libm-2.19.so  feupdateenv@@GLIBC_2.4
4        0.0182   feholdexcpt.c:32             libm-2.19.so  feholdexcept
4        0.0182   fesetenv.c:31                libm-2.19.so  fesetenv@@GLIBC_2.4
3        0.0136   mpa.c:767                    libm-2.19.so  __sqr
2        0.0091   strtod_l.c:483               libc-2.19.so  ____strtod_l_internal
1        0.0045   c-ray-mt.c:170               c-ray-mt      main
1        0.0045   dl-tls.c:770                 ld-2.19.so    __tls_get_addr
1        0.0045   dl-reloc.c:154               ld-2.19.so    _dl_relocate_object
1        0.0045   (no location information)    libc-2.19.so  .udivsi3_skip_div0_test
1        0.0045   malloc.c:3302                libc-2.19.so  _int_malloc
1        0.0045   random_r.c:366               libc-2.19.so  random_r
1        0.0045   strtod_l.c:201               libc-2.19.so  round_and_return

compared to the default codegen:

samples  %        linenr info                  image name    symbol name
11657    64.6211  c-ray-mt.c:377               c-ray-mt      shade
3396     18.8259  c-ray-mt.c:336               c-ray-mt      trace
2586     14.3356  c-ray-mt.c:308               c-ray-mt      render_scanline
172      0.9535   e_pow.c:70                   libm-2.19.so  __pow_finite
49       0.2716   (no location information)    no-vmlinux    /no-vmlinux
47       0.2605   e_exp.c:240                  libm-2.19.so  __exp1
41       0.2273   c-ray-mt.c:454               c-ray-mt      get_primary_ray
39       0.2162   c-ray-mt.c:497               c-ray-mt      get_sample_pos
16       0.0887   e_pow.c:430                  libm-2.19.so  checkint
12       0.0665   fraiseexcpt.c:27             libm-2.19.so  feraiseexcept
7        0.0388   fputc.c:37                   libc-2.19.so  fputc
2        0.0111   c-ray-mt.c:170               c-ray-mt      main
2        0.0111   strtod_l.c:483               libc-2.19.so  ____strtod_l_internal
2        0.0111   mpa.c:767                    libm-2.19.so  __sqr
2        0.0111   feholdexcpt.c:32             libm-2.19.so  feholdexcept
2        0.0111   fesetround.c:31              libm-2.19.so  fesetround
1        0.0055   cxa_thread_atexit_impl.c:83  libc-2.19.so  __call_tls_dtors
1        0.0055   memchr.S:58                  libc-2.19.so  memchr
1        0.0055   random_r.c:366               libc-2.19.so  random_r
1        0.0055   strtok.c:38                  libc-2.19.so  strtok
1        0.0055   mpa.c:614                    libm-2.19.so  __mul
1        0.0055   fesetenv.c:31                libm-2.19.so  fesetenv@@GLIBC_2.4
1        0.0055   feupdateenv.c:27             libm-2.19.so  feupdateenv@@GLIBC_2.4
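
For anyone wanting to cross-check the numbers, the quoted timings and the sample counts of the three hottest symbols can be compared with a few lines of Python. This is only a sketch: pairing 3304 ms with the A9 codegen and 2694 ms with the default codegen is an assumption based on the A9 profile being described as the slower one, and the helper names (`pct_saved`, `pct_slower`, `sample_ratios`) are made up for illustration.

```python
# Timings quoted in the comment, in ms. Which build each number belongs to is
# an assumption (3304 = Cortex-A9 codegen, 2694 = default codegen).
A9_MS = 3304
DEFAULT_MS = 2694

# opreport sample counts for the three hottest symbols: (A9, default).
SAMPLES = {
    "shade": (14460, 11657),
    "trace": (3901, 3396),
    "render_scanline": (3181, 2586),
}


def pct_saved(slow, fast):
    """Time saved by the fast build, as a percentage of the slow build."""
    return (slow - fast) / slow * 100.0


def pct_slower(slow, fast):
    """Extra time of the slow build, as a percentage of the fast build."""
    return (slow - fast) / fast * 100.0


def sample_ratios(samples):
    """Per-symbol ratio of A9 samples to default samples."""
    return {name: a9 / default for name, (a9, default) in samples.items()}


if __name__ == "__main__":
    print(f"default is {pct_saved(A9_MS, DEFAULT_MS):.1f}% faster than A9")
    print(f"A9 is {pct_slower(A9_MS, DEFAULT_MS):.1f}% slower than default")
    for name, ratio in sample_ratios(SAMPLES).items():
        print(f"{name}: {ratio:.2f}x samples with A9 codegen")
```

The 18% figure matches the difference measured against the slower run (about 18.5%); measured against the faster run it comes to about 22.6%. The per-symbol sample ratios are only comparable if both profiles were taken with the same event count, which the second profile does not show.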