Mans Rullgard <[email protected]> writes:

> Currently, --enable-small turns av_always_inline into plain inline,
> which is more or less ignored by the compiler.  While the intent of
> this is probably to reduce code size by avoiding some inlining, it
> has more far-reaching effects.
>
> We use av_always_inline in two situations:
>
> 1. The body of a function is smaller than the call overhead.
>    Instances of these are abundant in libavutil, the bswap.h
>    functions being good examples.
>
> 2. The function is a template relying on constant propagation
>    through inlined calls for sane code generation.  These are
>    often found in motion compensation code.
>
> Both of these types of functions should be inlined even if targeting
> small code size.
>
> Although GCC has heuristics for detecting the first of these types,
> it is not always reliable, especially when the function uses inline
> assembler, which is often the reason for having those functions in
> the first place, so making it explicit is generally a good idea.
>
> The size increase from inlining template-type functions is usually
> much smaller than it seems due to different branches being mutually
> exclusive between the different invocations.  The dead branches can,
> however, only be removed after inlining and constant propagation have
> been performed, which means the initial cost estimate for inlining
> these is much higher than is actually the case, resulting in GCC
> often making bad choices if left to its own devices.
>
> Furthermore, the GCC inliner limits how much it allows a function to
> grow due to automatic inlining of calls, and this appears to not take
> call overhead into account.  When nested inlining is used, the limit
> may be hit before the innermost level is reached.  In some cases, this
> has prevented inlining of type 1 functions as defined above, resulting
> in significant performance loss.
>
> Signed-off-by: Mans Rullgard <[email protected]>
> ---
>  configure |    9 ---------
>  1 file changed, 9 deletions(-)

On ARM, this change increases the code size of an --enable-small build
by about 370k, to 4.5M; half of that increase is in h264.o.  A normal
build has a code size of 6.9M.  The patch improves speed by up to 10%,
with H.264 benefiting in particular.
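
To make the two situations from the quoted commit message concrete,
here is a minimal sketch, assuming GCC or Clang.  The names
my_always_inline, my_bswap32, block_op, put_block8 and avg_block8 are
hypothetical, made up for illustration, and are not taken from the
libav source; my_always_inline merely stands in for av_always_inline.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_always_inline (assumes GCC or Clang).
 * Plain "inline" is only a hint; the attribute makes it mandatory. */
#if defined(__GNUC__)
#    define my_always_inline __attribute__((always_inline)) inline
#else
#    define my_always_inline inline
#endif

/* Type 1: the body is smaller than the overhead of calling it. */
static my_always_inline uint32_t my_bswap32(uint32_t x)
{
    return  (x >> 24)               |
           ((x >>  8) & 0x0000ff00) |
           ((x <<  8) & 0x00ff0000) |
            (x << 24);
}

/* Type 2: a template.  "avg" is a constant at every call site, so
 * once the call is inlined and the constant propagated, only one of
 * the two branches is left in each specialised function. */
static my_always_inline void block_op(uint8_t *dst, const uint8_t *src,
                                      int n, int avg)
{
    for (int i = 0; i < n; i++) {
        if (avg)
            dst[i] = (dst[i] + src[i] + 1) >> 1; /* rounded average */
        else
            dst[i] = src[i];                     /* plain copy */
    }
}

/* Two specialisations generated from the same template. */
void put_block8(uint8_t *dst, const uint8_t *src)
{
    block_op(dst, src, 8, 0);
}

void avg_block8(uint8_t *dst, const uint8_t *src)
{
    block_op(dst, src, 8, 1);
}
```

If block_op were left as a real function call, the test on avg would
be evaluated at run time inside the loop, and both branches would be
present in the binary.  With forced inlining, each of put_block8 and
avg_block8 can keep only the branch it actually uses.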

-- 
Måns Rullgård
[email protected]