On Fri, Aug 16, 2013 at 02:15:15AM -0700, Alexander Strange wrote:
> On Thu, Aug 15, 2013 at 3:29 PM, Diego Biurrun <[email protected]> wrote:
> >
> > Older versions of clang choke if that function is forcibly inlined.
> > Furthermore, inlining the function gives no performance benefit at
> > least with gcc 4.4 and 4.6.
> 
> What does clang do?

With the patch applied, nm tells me the following:

~/tmp/build/clang$ nm libavcodec/h264_cabac.o
         U av_log
0000228e r b_mb_type_info
00002246 r b_sub_mb_type_info
00000000 r cabac_context_init_I
00000800 r cabac_context_init_PB
0000209b r chroma422_dc_scan
000020a3 r chroma_dc_scan
00007e30 t decode_cabac_intra_mb_type
000091a0 t decode_cabac_mb_mvd
00007a30 t decode_cabac_mb_skip
000094f0 t decode_cabac_residual_internal
00000000 r decode_cabac_residual_internal.coeff_abs_level1_ctx
00002188 r decode_cabac_residual_internal.coeff_abs_level_m1_offset
00000010 r decode_cabac_residual_internal.coeff_abs_level_transition
00000000 r decode_cabac_residual_internal.coeff_abs_levelgt1_ctx
00002118 r decode_cabac_residual_internal.last_coeff_flag_offset
0000223e r decode_cabac_residual_internal.sig_coeff_offset_dc
000020a8 r decode_cabac_residual_internal.significant_coeff_flag_offset
000021c0 r decode_cabac_residual_internal.significant_coeff_flag_offset_8x8
         U ff_h264_cabac_tables
         U ff_h264_check_intra4x4_pred_mode
         U ff_h264_check_intra_pred_mode
00000090 T ff_h264_decode_mb_cabac
00000000 T ff_h264_init_cabac_states
         U ff_h264_mb_sizes
         U ff_h264_pred_direct_motion
         U ff_init_cabac_decoder
00007f90 t fill_decode_caches
00007b70 t fill_decode_neighbors
000022ea r fill_decode_neighbors.left_block_options
00007d90 t get_cabac_noinline
00002000 r i_mb_type_info
0000227a r p_mb_type_info
00000020 r p_sub_mb_type_info
00000000 r pred_pskip_motion.zeromv
00002068 r scan8

> I agree performance measurements are important for system compilers,
> but otherwise would prefer to ignore old versions.

I think you have a point there.

> That function should absolutely be always_inline though - its last two
> parameters are intended to be optimized out once it's inlined into
> dc_internal/nondc_internal/dc_internal_422. Not inlining it will just
> result in some very useless if tests.
> 
> But I guess the compiler could be defeating that intention by inlining
> in the wrong order - it does seem to be since the nm output isn't what
> I expect.
> 
> Could you try making these noinline:
> 
> decode_cabac_residual_dc_internal
> decode_cabac_residual_dc_internal_422
> decode_cabac_residual_nondc_internal

With these three marked as av_noinline and an av_always_inline attribute
on decode_cabac_residual_internal(), clang has no problems compiling the
code and does the following:

tmp@silver:~/tmp/build/clang$ nm libavcodec/h264_cabac.o
         U av_log
0000228e r b_mb_type_info
00002246 r b_sub_mb_type_info
00000000 r cabac_context_init_I
00000800 r cabac_context_init_PB
0000209b r chroma422_dc_scan
000020a3 r chroma_dc_scan
00007ab0 t decode_cabac_intra_mb_type
00008ec0 t decode_cabac_mb_mvd
000076b0 t decode_cabac_mb_skip
00009210 t decode_cabac_residual_dc_internal
0000a870 t decode_cabac_residual_dc_internal_422
00000000 r decode_cabac_residual_internal.coeff_abs_level1_ctx
00002188 r decode_cabac_residual_internal.coeff_abs_level_m1_offset
00000010 r decode_cabac_residual_internal.coeff_abs_level_transition
00000000 r decode_cabac_residual_internal.coeff_abs_levelgt1_ctx
00002118 r decode_cabac_residual_internal.last_coeff_flag_offset
0000223e r decode_cabac_residual_internal.sig_coeff_offset_dc
000020a8 r decode_cabac_residual_internal.significant_coeff_flag_offset
000021c0 r decode_cabac_residual_internal.significant_coeff_flag_offset_8x8
00009c40 t decode_cabac_residual_nondc_internal
         U ff_h264_cabac_tables
         U ff_h264_check_intra4x4_pred_mode
         U ff_h264_check_intra_pred_mode
00000090 T ff_h264_decode_mb_cabac
00000000 T ff_h264_init_cabac_states
         U ff_h264_mb_sizes
         U ff_h264_pred_direct_motion
         U ff_init_cabac_decoder
00007c10 t fill_decode_caches
000077f0 t fill_decode_neighbors
000022ea r fill_decode_neighbors.left_block_options
00007a10 t get_cabac_noinline
00002000 r i_mb_type_info
0000227a r p_mb_type_info
00000020 r p_sub_mb_type_info
00000000 r pred_pskip_motion.zeromv
00002068 r scan8

So indeed that appears to achieve what is intended, i.e. inlining
decode_cabac_residual_internal().

gcc 4.7 does the following:

tmp@silver:~/tmp/build/gcc-4.7$ nm libavcodec/h264_cabac.o
         U av_log
00002000 r b_mb_type_info
00002140 r b_sub_mb_type_info
00001800 r cabac_context_init_I
00000000 r cabac_context_init_PB
00002184 r chroma422_dc_scan
0000218c r chroma_dc_scan
00002310 r coeff_abs_level1_ctx.8622
00002340 r coeff_abs_level_m1_offset.8619
00002318 r coeff_abs_level_transition.8624
00002328 r coeff_abs_levelgt1_ctx.8623
00001970 t decode_cabac_intra_mb_type
00002ce0 t decode_cabac_mb_mvd
00001af0 t decode_cabac_mb_skip
00001ce0 t decode_cabac_residual_dc_internal
00002f50 t decode_cabac_residual_dc_internal_422.constprop.1
000023c0 t decode_cabac_residual_nondc_internal
000017e0 t decode_significance_x86
         U ff_h264_cabac_tables
         U ff_h264_check_intra4x4_pred_mode
         U ff_h264_check_intra_pred_mode
00003830 T ff_h264_decode_mb_cabac
000037a0 T ff_h264_init_cabac_states
         U ff_h264_mb_sizes
         U ff_h264_pred_direct_motion
         U ff_init_cabac_decoder
000003d0 t fill_decode_caches
000001b0 t fill_decode_neighbors
000000f0 t get_cabac
00000030 t get_cabac_noinline
00002080 r i_mb_type_info
000022a0 r last_coeff_flag_offset.8618
000021a0 r left_block_options.8238
00000000 t mid_pred
0000205c r p_mb_type_info
00002174 r p_sub_mb_type_info
00002100 r scan8
00002220 r significant_coeff_flag_offset.8617
00002380 r significant_coeff_flag_offset_8x8.8620
00002190 r zeromv.8214

I have no idea what ".constprop.1" is supposed to be, but it seems to
be the desired outcome.

Diego
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to