Mans Rullgard <[email protected]> writes:

> Currently, --enable-small turns av_always_inline into plain inline,
> which is more or less ignored by the compiler. While the intent of
> this is probably to reduce code size by avoiding some inlining, it
> has more far-reaching effects.
>
> We use av_always_inline in two situations:
>
> 1. The body of a function is smaller than the call overhead.
>    Instances of these are abundant in libavutil, the bswap.h
>    functions being good examples.
>
> 2. The function is a template relying on constant propagation
>    through inlined calls for sane code generation. These are
>    often found in motion compensation code.
>
> Both of these types of functions should be inlined even if targeting
> small code size.
>
> Although GCC has heuristics for detecting the first of these types,
> it is not always reliable, especially when the function uses inline
> assembler, which is often the reason for having those functions in
> the first place, so making it explicit is generally a good idea.
>
> The size increase from inlining template-type functions is usually
> much smaller than it seems due to different branches being mutually
> exclusive between the different invocations. The dead branches can,
> however, only be removed after inlining and constant propagation have
> been performed, which means the initial cost estimate for inlining
> these is much higher than is actually the case, resulting in GCC
> often making bad choices if left to its own devices.
>
> Furthermore, the GCC inliner limits how much it allows a function to
> grow due to automatic inlining of calls, and this appears to not take
> call overhead into account. When nested inlining is used, the limit
> may be hit before the innermost level is reached. In some cases, this
> has prevented inlining of type 1 functions as defined above, resulting
> in significant performance loss.
>
> Signed-off-by: Mans Rullgard <[email protected]>
> ---
>  configure | 9 ---------
>  1 file changed, 9 deletions(-)
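
To make the two situations above concrete, here is a minimal sketch in C.
It is not taken from the tree: my_always_inline is a hypothetical stand-in
for av_always_inline (assuming a GCC-compatible compiler), and my_bswap32,
put_or_avg_pixels, put_pixels and avg_pixels are illustrative names only.
The first function is a type 1 case (body smaller than a call); the second
is a type 2 template whose "avg" argument is a constant at every call site,
so each wrapper should keep only one branch once inlining and constant
propagation have run.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_always_inline; assumes GCC or Clang. */
#if defined(__GNUC__)
#define my_always_inline __attribute__((always_inline)) inline
#else
#define my_always_inline inline
#endif

/* Type 1: the body is smaller than the overhead of calling it. */
static my_always_inline uint32_t my_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/*
 * Type 2: a "template". The avg argument is a literal constant in each
 * wrapper below, so after inlining the compiler can drop the branch
 * that is not taken. Before inlining, the size estimate counts both.
 */
static my_always_inline void put_or_avg_pixels(uint8_t *dst,
                                               const uint8_t *src,
                                               int n, int avg)
{
    for (int i = 0; i < n; i++) {
        if (avg)
            dst[i] = (uint8_t)((dst[i] + src[i] + 1) >> 1); /* rounded mean */
        else
            dst[i] = src[i];                                /* plain copy */
    }
}

/* Each instantiation only ever exercises one of the two branches. */
static void put_pixels(uint8_t *dst, const uint8_t *src, int n)
{
    put_or_avg_pixels(dst, src, n, 0);
}

static void avg_pixels(uint8_t *dst, const uint8_t *src, int n)
{
    put_or_avg_pixels(dst, src, n, 1);
}
```

If the macro degrades to plain inline, the compiler is free to emit
put_or_avg_pixels as a real function and test avg inside the loop at run
time, which is the kind of code generation the commit message argues
against.
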

On ARM, this change increases the code size of an --enable-small build
by about 370k to 4.5M, half of that in h264.o. A normal build has a
code size of 6.9M.

Speedwise, this patch improves performance by up to 10%, H.264
benefitting particularly well.

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
