Re: [FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

2014-11-07 Thread Christophe Gisquet
Hi,
2014-11-06 23:04 GMT+01:00 James Almer jamr...@gmail.com:
> No, the function checks for alignment and jumps to a branch that uses movdqu if needed.
> ff_int32_to_float_a_avx also uses ymm regs and this same macro.
OK, so nothing new here, same 32-byte alignment. when mulps m0, m1, [mem]
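For readers of the archive, the alignment check James describes can be sketched in plain C. This is only an illustration, not the code from the patch: `is_aligned`, `pick_path`, `float_to_int32` and `convert` are hypothetical names, the 2^31 scale with saturation is an assumption the thread does not spell out, and the scalar loop stands in for the SIMD loads and stores (aligned moves on one branch, movdqu on the other).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is aligned to `align` bytes
 * (`align` must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

enum path { PATH_ALIGNED, PATH_UNALIGNED };

/* Choose the branch: the aligned one only if both buffers meet the
 * 32-byte requirement of ymm-sized aligned accesses. */
static enum path pick_path(const void *dst, const void *src)
{
    return (is_aligned(dst, 32) && is_aligned(src, 32)) ? PATH_ALIGNED
                                                        : PATH_UNALIGNED;
}

/* Scalar stand-in for one sample conversion. Assumption: scale by 2^31
 * and saturate. Rounding here is half away from zero for simplicity;
 * the SIMD conversion instruction follows the current rounding mode. */
static int32_t float_to_int32(float x)
{
    double v = (double)x * 2147483648.0;
    if (v >= 2147483647.0)
        return INT32_MAX;
    if (v <= -2147483648.0)
        return INT32_MIN;
    return (int32_t)(v >= 0.0 ? v + 0.5 : v - 0.5);
}

/* Convert len samples and report which branch was taken. In assembly the
 * two branches differ only in the load/store instructions they use. */
static enum path convert(int32_t *dst, const float *src, size_t len)
{
    enum path p = pick_path(dst, src);
    for (size_t i = 0; i < len; i++)
        dst[i] = float_to_int32(src[i]);
    return p;
}
```

Because the check happens inside the function, callers see no stricter alignment requirement; misaligned buffers simply take the slower branch.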

Re: [FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

2014-11-07 Thread James Almer
On 07/11/14 6:05 AM, Christophe Gisquet wrote:
> Hi,
> 2014-11-06 23:04 GMT+01:00 James Almer jamr...@gmail.com:
>> No, the function checks for alignment and jumps to a branch that uses movdqu if needed.
>> ff_int32_to_float_a_avx also uses ymm regs and this same macro.
> OK, so nothing new here,

[FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

2014-11-06 Thread James Almer
13797 decicycles in ff_float_to_int32_a_sse2, 32768 runs, 0 skips
8603 decicycles in ff_float_to_int32_a_avx2, 32766 runs, 2 skips
Signed-off-by: James Almer jamr...@gmail.com
---
x86inc.asm doesn't seem to handle cmpps or its aliases properly when using avx.
libswresample/x86/audio_convert.asm

Re: [FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

2014-11-06 Thread Christophe Gisquet
Hi,
2014-11-06 21:48 GMT+01:00 James Almer jamr...@gmail.com:
> 13797 decicycles in ff_float_to_int32_a_sse2, 32768 runs, 0 skips
> 8603 decicycles in ff_float_to_int32_a_avx2, 32766 runs, 2 skips
A couple of naïve questions (I haven't checked): Does it increase the alignment requirement? If yes,

Re: [FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

2014-11-06 Thread James Almer
On 06/11/14 6:35 PM, Christophe Gisquet wrote:
> Hi,
> 2014-11-06 21:48 GMT+01:00 James Almer jamr...@gmail.com:
>> 13797 decicycles in ff_float_to_int32_a_sse2, 32768 runs, 0 skips
>> 8603 decicycles in ff_float_to_int32_a_avx2, 32766 runs, 2 skips
> A couple of naïve questions (I haven't checked):