[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-03 Thread James Almer
Two to four times faster depending on instruction set, block size and channel count.

Signed-off-by: James Almer <jamr...@gmail.com>
---
Now also with 16 bits indep4 and indep6.

 libavcodec/arm/flacdsp_init_arm.c | 2 +-
 libavcodec/flacdec.c | 6 +-
 libavcodec/flacdsp.c

[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-02 Thread James Almer
Two to four times faster depending on instruction set, block size and channel count.

Signed-off-by: James Almer <jamr...@gmail.com>
---
TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels. AVX2 and maybe MMX versions. Planar?

 libavcodec/arm/flacdsp_init_arm.c

Re: [FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-02 Thread Clément Bœsch
On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> Two to four times faster depending on instruction set, block size and channel count.
>
> Signed-off-by: James Almer <jamr...@gmail.com>
> ---
> TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels. AVX2 and

Re: [FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-02 Thread James Almer
On 02/11/14 7:43 PM, Clément Bœsch wrote:
> On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> > Two to four times faster depending on instruction set, block size and channel count.
> >
> > Signed-off-by: James Almer <jamr...@gmail.com>
> > ---
> > TODO: 16 bits indep for 4, 6 and 8 channels. 24/32

Re: [FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-02 Thread Clément Bœsch
On Sun, Nov 02, 2014 at 07:55:35PM -0300, James Almer wrote:
> On 02/11/14 7:43 PM, Clément Bœsch wrote:
> > On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> > > Two to four times faster depending on instruction set, block size and channel count.
> > >
> > > Signed-off-by: James Almer

[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

2014-11-02 Thread James Almer
Two to four times faster depending on instruction set, block size and channel count.

Signed-off-by: James Almer <jamr...@gmail.com>
---
 libavcodec/arm/flacdsp_init_arm.c | 2 +-
 libavcodec/flacdec.c | 6 +-
 libavcodec/flacdsp.c | 6 +-
 libavcodec/flacdsp.h
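
For readers unfamiliar with what a FLAC "decorrelate" step computes, here is a minimal scalar sketch in C of the stereo case with 16-bit interleaved output. It is not the code from the patch: the function name, enum, and signature are hypothetical and chosen only for illustration, and it assumes the compiler implements right shift of negative integers as an arithmetic shift. SIMD versions such as the SSE2 and AVX ones discussed in this thread would perform the same arithmetic on several samples per instruction.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not FFmpeg's actual API. */
enum stereo_mode {
    STEREO_INDEP,      /* ch0 = left, ch1 = right           */
    STEREO_LEFT_SIDE,  /* ch0 = left, ch1 = left - right    */
    STEREO_RIGHT_SIDE, /* ch0 = left - right, ch1 = right   */
    STEREO_MID_SIDE    /* ch0 = (left + right) >> 1, ch1 = left - right */
};

/*
 * Undo the inter-channel decorrelation of one stereo block and write
 * the result as interleaved 16-bit samples (L R L R ...).
 * Assumes arithmetic right shift for negative values.
 */
static void decorrelate_stereo_s16(int16_t *out,
                                   const int32_t *ch0,
                                   const int32_t *ch1,
                                   int len, enum stereo_mode mode)
{
    for (int i = 0; i < len; i++) {
        int32_t a = ch0[i];
        int32_t b = ch1[i];
        int32_t left, right;

        switch (mode) {
        case STEREO_LEFT_SIDE:
            left  = a;
            right = a - b;        /* right = left - side */
            break;
        case STEREO_RIGHT_SIDE:
            left  = a + b;        /* left = side + right */
            right = b;
            break;
        case STEREO_MID_SIDE:
            right = a - (b >> 1); /* recover right from mid and side */
            left  = right + b;    /* left = right + side */
            break;
        default:                  /* independent channels */
            left  = a;
            right = b;
            break;
        }

        out[2 * i]     = (int16_t)left;
        out[2 * i + 1] = (int16_t)right;
    }
}
```

The loop has no dependency between iterations, which is what makes this kind of function a natural candidate for vectorization; the interleaving of the output is the part that varies with channel count.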