Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-08-19 Thread Michael Niedermayer
On Mon, Aug 18, 2014 at 03:28:02PM -0300, James Almer wrote: On 18/08/14 5:01 AM, Pierre Edouard Lepere wrote: Hi, here's the new version of the patch. Sorry for the delay. James, I have not done 8-bit AVX versions because it requires unpacks that are done differently in AVX. Aren't

Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-08-18 Thread Pierre Edouard Lepere
Hi, here's the new version of the patch. Sorry for the delay. James, I have not done 8-bit AVX versions because it requires unpacks that are done differently in AVX. Thanks for the feedback! -Pierre-Edouard Lepere commit 414ebcfeb47ea99ac7e8281d2794996d8a2a09fc Author: plepere

Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-08-18 Thread James Almer
On 18/08/14 5:01 AM, Pierre Edouard Lepere wrote: Hi, here's the new version of the patch. Sorry for the delay. James, I have not done 8-bit AVX versions because it requires unpacks that are done differently in AVX. Aren't you thinking of AVX2 with 256-bit wide registers? With AVX I mean an

[FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-07-30 Thread Pierre Edouard Lepere
Hi, Here's a patch adding ASM transform_add functions for HEVC. Regards, Pierre-Edouard Lepere commit 1db36e2f5bae3a34d1a5db4520234b52afb51bbb Author: plepere pierre-edouard.lep...@insa-rennes.fr Date: Wed Jul 30 10:31:49 2014 +0200 adding ASM transform_add functions for HEVC diff --git
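For readers new to the thread: the patch itself is cut off above, so what follows is not the submitted code. It is only a minimal scalar sketch, in C, of what a transform_add routine is generally understood to compute for 8-bit video, namely adding a block of 16-bit residuals to the predicted samples and clamping to the sample range. The names clip_u8 and transform_add_8_ref, the parameter order, and the assumption that residuals are stored contiguously (size values per row) are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit sample range [0, 255]. */
static uint8_t clip_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical scalar reference (not the patch's code): add a size x size
 * block of 16-bit residuals to the 8-bit samples in dst, clamping each sum.
 * stride is the distance in bytes between rows of dst; the residuals are
 * assumed to be packed, size values per row.
 */
static void transform_add_8_ref(uint8_t *dst, const int16_t *coeffs,
                                ptrdiff_t stride, int size)
{
    assert(dst && coeffs && size > 0);
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + coeffs[x]);
        dst    += stride;  /* next row of the picture */
        coeffs += size;    /* next row of residuals */
    }
}
```

A SIMD version would handle several samples of a row per instruction, which is where the widening, saturating-pack, and load-alignment questions raised in the replies come in.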

Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-07-30 Thread Ronald S. Bultje
Hi! On Wed, Jul 30, 2014 at 9:33 AM, Pierre Edouard Lepere pierre-edouard.lep...@insa-rennes.fr wrote: Here's a patch adding ASM transform_add functions for HEVC. Yay! I'll try to review soon. Do you have rough performance metrics? I know it's faster :-p but it's nice to document by how

Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-07-30 Thread James Almer
On 30/07/14 10:33 AM, Pierre Edouard Lepere wrote:
> +%macro TR_ADD_INIT_SSE_8 2
> +movu m4, [r1]
> +movu m6, [r1+16]
> +movu m8, [r1+32]
> +movu m10, [r1+48]
You can use mova here, and probably in every other movu as well.
> +lea

Re: [FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

2014-07-30 Thread Ronald S. Bultje
Hi, On Wed, Jul 30, 2014 at 5:04 PM, James Almer jamr...@gmail.com wrote:
> On 30/07/14 10:33 AM, Pierre Edouard Lepere wrote:
> > +%macro TR_ADD_INIT_SSE_8 2
> > +movu m4, [r1]
> > +movu m6, [r1+16]
> > +movu m8, [r1+32]
> > +movu m10,