There are 3 main modifications that have an impact beyond x86:
- int8x8_fmul_int32 can either be an inline function or be called through
  the dsp context;
- lfe_dir cases have been split, because the function is overly complex to
  optimize otherwise;
- synth filter float is rewritten so that the imdct calls are done in C, as
  function calls are overly complex to make from x86 asm.
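
To illustrate the first point, here is a minimal sketch of a routine that is usable both as an inline function and through a function pointer in a dsp context. The names SketchDSPContext, sketch_int8x8_fmul_int32 and sketch_dsp_init are hypothetical, and the 1/16 scaling is an assumption made for the example; none of this is taken from the patches themselves.

```c
#include <stdint.h>

/* Hypothetical context; the real dsp context has other members. */
typedef struct SketchDSPContext {
    void (*int8x8_fmul_int32)(float *dst, const int8_t *src, int scale);
} SketchDSPContext;

/* Portable version: small enough to be inlined where it is called directly. */
static inline void sketch_int8x8_fmul_int32(float *dst, const int8_t *src,
                                            int scale)
{
    /* Assumed scaling factor, chosen for illustration only. */
    float fscale = scale / 16.0f;

    for (int i = 0; i < 8; i++)
        dst[i] = src[i] * fscale;
}

/* Default init: point the context at the C version. */
static void sketch_dsp_init(SketchDSPContext *c)
{
    c->int8x8_fmul_int32 = sketch_int8x8_fmul_int32;
    /* An arch-specific init could overwrite the pointer here with an
     * optimized implementation when the CPU supports it. */
}
```

A caller that always wants the C code can use the inline function directly, while a caller that wants the fastest available version goes through the context pointer after init.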

All patches pass fate-dts.

Christophe Gisquet (11):
  dcadsp: add int8x8_fmul_int32 to dsp context
  x86: dcadsp: implement int8x8_fmul_int32
  dcadsp: split lfe_dir cases
  x86: dcadsp: implement SSE lfe_dir
  dca dsp C: factorize scaling in lfe_fir
  dcadsp: scan linearly coeffs instead.
  x86: synth filter float: implement SSE2 version
  dcadsp: split synth_filter_float
  dca: replace some memcpy by AV_COPY128
  dca: factorize scaling in inverse ADPCM
  x86: float dsp: unroll SSE versions

 libavcodec/arm/dca.h             |   5 +-
 libavcodec/arm/dcadsp_init_arm.c |  33 ++++++-
 libavcodec/dcadec.c              |  45 ++++-----
 libavcodec/dcadsp.c              |  56 ++++++++---
 libavcodec/dcadsp.h              |   9 +-
 libavcodec/synth_filter.c        |  14 +--
 libavcodec/synth_filter.h        |   9 +-
 libavcodec/x86/Makefile          |   3 +
 libavcodec/x86/dca.h             |  56 +++++++++++
 libavcodec/x86/dcadsp.asm        | 195 +++++++++++++++++++++++++++++++++++++++
 libavcodec/x86/dcadsp_init.c     |  52 +++++++++++
 libavcodec/x86/fft_init.c        |  17 ++++
 libavcodec/x86/synth_filter.asm  | 178 +++++++++++++++++++++++++++++++++++
 libavutil/x86/float_dsp.asm      |  40 ++++----
 14 files changed, 635 insertions(+), 77 deletions(-)
 create mode 100644 libavcodec/x86/dca.h
 create mode 100644 libavcodec/x86/dcadsp.asm
 create mode 100644 libavcodec/x86/dcadsp_init.c
 create mode 100644 libavcodec/x86/synth_filter.asm

-- 
1.8.0.msysgit.0

_______________________________________________
libav-devel mailing list
https://lists.libav.org/mailman/listinfo/libav-devel
