Most of the slowdown is due to the fact that the function that reads mantissas is a branch-nest, so adding even one more branch makes the compilers try something not so great.
I added some more local contexts (which work quite nicely as compiler hints) and introduced an extra simple prefetch + read_cache to see if that is the problem; the slowdown has been reduced a lot.

Small speedup
[PATCH 01/10] ac3: Use local contexts

This is mostly to keep my sanity; it shouldn't have a big effect, and some compilers flip the coin and speed it up / slow it down.
[PATCH 02/10] ac3: Split spx-specific code from decode_audio_block
[PATCH 03/10] ac3: Split coupling-specific code from
[PATCH 04/10] ac3: Add some inline hints

While at it.
[PATCH 05/10] ac3: Simplify skipping

It could be made smarter, but it works well enough for the purpose.
[PATCH 06/10] bitstream: add prefetch() and read_cache()

With the following the slowdown is contained.
[PATCH 07/10] ac3: use prefetch on ac3_parse_header
[PATCH 08/10] ac3: Use prefetch in the decode_audio_block
[PATCH 09/10] ac3: prefetch in avpriv_ac3_parse_header
[PATCH 10/10] ac3: prefetch decode_transform_coeffs

Set sent since Vittorio wants to benchmark more =P

lu
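For readers who have not looked at the patches, here is a rough sketch of the prefetch + read_cache idea: refill the bit cache once, then do several reads that skip the refill check. This is not the code from the set. BitReader, br_prefetch, br_read_cache, ToyHeader and parse_toy_header are hypothetical names, and the 16/2/6-bit field layout is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader for illustration only, not the libav bitstream API. */
typedef struct BitReader {
    const uint8_t *buf;  /* input bytes */
    size_t size;         /* buffer length in bytes */
    size_t pos;          /* next byte to load into the cache */
    uint64_t cache;      /* unread bits, left-aligned (MSB first) */
    unsigned bits_left;  /* number of valid bits in the cache */
} BitReader;

static void br_init(BitReader *br, const uint8_t *buf, size_t size)
{
    br->buf = buf;
    br->size = size;
    br->pos = 0;
    br->cache = 0;
    br->bits_left = 0;
}

/* Make sure at least n bits (n <= 32) are cached.
 * Past the end of the buffer, zero bytes are fed instead of reading. */
static void br_prefetch(BitReader *br, unsigned n)
{
    assert(n <= 32);
    while (br->bits_left < n) {
        uint64_t byte = br->pos < br->size ? br->buf[br->pos] : 0;
        br->pos++;
        br->cache |= byte << (56 - br->bits_left);
        br->bits_left += 8;
    }
}

/* Read n bits (1..32) straight from the cache, with no refill branch.
 * The caller must have prefetched at least n bits beforehand. */
static uint32_t br_read_cache(BitReader *br, unsigned n)
{
    assert(n >= 1 && n <= 32 && n <= br->bits_left);
    uint32_t v = (uint32_t)(br->cache >> (64 - n));
    br->cache <<= n;
    br->bits_left -= n;
    return v;
}

/* Made-up header layout: 16 + 2 + 6 bits. */
typedef struct ToyHeader {
    unsigned sync, code, id;
} ToyHeader;

/* One prefetch up front, then three reads without refill checks. */
static ToyHeader parse_toy_header(const uint8_t *buf, size_t size)
{
    BitReader br;
    ToyHeader h;

    br_init(&br, buf, size);
    br_prefetch(&br, 24);
    h.sync = br_read_cache(&br, 16);
    h.code = br_read_cache(&br, 2);
    h.id   = br_read_cache(&br, 6);
    return h;
}
```

The point of the split is that a run of small reads pays for a single refill check instead of one per read, which should leave fewer branches in an already branch-heavy function.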
