Slightly increasing the complexity of a function can disproportionately increase the size and runtime of the generated code. This appears to be due to the optimisers giving up on code blocks above a certain abstract size, and is particularly severe on PPC and ARM, but is observable on ia32 and amd64 as well.
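For illustration only, here is a minimal sketch of the shape of source involved: one function built from a small tower of inline helpers that pass a wrapper struct by value, and a second with the same arithmetic expanded by hand. This is not the attached test case. The names (wrapped, w_mul, w_madd, tower_kernel, manual_kernel) are hypothetical, and no claim is made that a sketch this small is complex enough to trigger the problem. Its purpose is to show which two functions to compare in the assembly output, e.g. from gcc -O2 -S.

```c
#include <stddef.h>

/* Hypothetical wrapper type: passing it by value through several
   layers of helpers is what the optimiser has to see through. */
typedef struct { float v; int tag; } wrapped;

static inline wrapped wrap(float x)
{
    wrapped w = { x, 0 };
    return w;
}

static inline wrapped w_mul(wrapped a, wrapped b)
{
    return wrap(a.v * b.v);
}

static inline wrapped w_madd(wrapped a, wrapped b, wrapped c)
{
    return wrap(a.v * b.v + c.v);
}

/* Built from the tower of inline helpers. */
float tower_kernel(const float *in, size_t n,
                   float scale, float gain, float bias)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        wrapped t = w_mul(wrap(in[i]), wrap(scale));
        wrapped r = w_madd(wrap(gain), t, wrap(bias));
        acc += r.v;
    }
    return acc;
}

/* Same arithmetic with the helpers expanded by hand. */
float manual_kernel(const float *in, size_t n,
                    float scale, float gain, float bias)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float t = in[i] * scale;
        acc += gain * t + bias;
    }
    return acc;
}
```

Ideally the inner loops of the two functions compile to the same instructions: a multiply feeding directly into a multiply-add, with no stores to the stack in between.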
This is a general problem which affects any large function, and has done since at least gcc3 days - I first encountered it when trying to use Altivec intrinsics. In some cases manually moving a function call *out* of a loop results in 4x the runtime, which is the opposite of normal expectations.

Attached is an example which demonstrates poor code generated after a long series of inlining and dead code elimination stages. Demonstration is on PPC32, but the same example suffices for ARMv7-A as well. An amd64 target produces reasonable code for this example, but a fairly small complexity increase causes a similar collapse.

The output is two functions, one generated from the tower of inlining, and the other (with a manual_ prefix) after the same optimisations were performed manually. The quality of the latter is clearly better than the former, which contains the following sequence in the inner loop:

        fmuls 0,12,0
        stw 4,32(1)
        stw 3,16(1)
        stw 3,20(1)
        stfs 0,8(1)
        lwz 4,8(1)
        stw 4,24(1)
        li 4,0
        stw 4,36(1)
        lfs 0,24(1)
        fmadds 0,9,0,10

All of the stw's in the above fragment are dead, except the "stw 4,24(1)" which merely shuffles the value from f0 through two memory locations and back to f0. The "li 4,0" also demonstrates very poor register allocation, since r4 already contains zero before this fragment. In the "manual" variant, the fmuls is immediately followed by the fmadds.

The same source file run through Clang on amd64 produces virtually identical output for the two versions.

--
Summary: Optimisations fail above arbitrary level of complexity
Product: gcc
Version: 4.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jonathan dot morton at movial dot com
GCC build triplet: powerpc-linux-gnu
GCC host triplet: powerpc-linux-gnu
GCC target triplet: powerpc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45498