Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread Reimar Döffinger
On 21.08.2014, at 22:40, Kieran Kunhya wrote:
>> It does that, but on older SSE2 cpus with not-so-good OOO execution grouping
>> instructions like this might help reduce dependencies a bit.
>
> Are any older SSE2 CPUs actually capable of decoding reasonable HEVC?
Of course they are. Not in real

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread James Almer
On 21/08/14 5:40 PM, Kieran Kunhya wrote:
>> It does that, but on older SSE2 cpus with not-so-good OOO execution grouping
>> instructions like this might help reduce dependencies a bit.
>
> Are any older SSE2 CPUs actually capable of decoding reasonable HEVC?
Probably not (at least nothing above

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread Kieran Kunhya
> It does that, but on older SSE2 cpus with not-so-good OOO execution grouping
> instructions like this might help reduce dependencies a bit.
Are any older SSE2 CPUs actually capable of decoding reasonable HEVC?

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread James Almer
On 21/08/14 2:15 PM, Christophe Gisquet wrote:
> Hi,
>
> 2014-08-21 0:42 GMT+02:00 James Almer:
>> * Reduced xmm register count to 7 (As such they are now enabled for x86_32).
>> * Removed four movdqa (affects the sse2 version only).
>> * pxor is now used to clear m0 only once.
>
> OK.
Pushed.

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread Christophe Gisquet
Hi,
2014-08-21 0:42 GMT+02:00 James Almer:
> * Reduced xmm register count to 7 (As such they are now enabled for x86_32).
> * Removed four movdqa (affects the sse2 version only).
> * pxor is now used to clear m0 only once.
OK.
-- Christophe

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread James Almer
On 21/08/14 10:03 AM, Hendrik Leppkes wrote:
> On Thu, Aug 21, 2014 at 12:42 AM, James Almer wrote:
>> * Reduced xmm register count to 7 (As such they are now enabled for x86_32).
>> * Removed four movdqa (affects the sse2 version only).
>> * pxor is now used to clear m0 only once.
>>
>> ~5% faste

Re: [FFmpeg-devel] [PATCH] x86/hevc_res_add: refactor ff_hevc_transform_add{16, 32}_8

2014-08-21 Thread Hendrik Leppkes
On Thu, Aug 21, 2014 at 12:42 AM, James Almer wrote:
> * Reduced xmm register count to 7 (As such they are now enabled for x86_32).
> * Removed four movdqa (affects the sse2 version only).
> * pxor is now used to clear m0 only once.
>
> ~5% faster.
>
> Signed-off-by: James Almer
> ---
Good job,
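
For readers unfamiliar with what the functions in this patch compute: the 8-bit transform_add step adds a block of signed 16-bit residuals to the predicted 8-bit pixels and clamps each result to 0..255. Below is a minimal scalar sketch of that operation in C, for orientation only. The names residual_add_8 and clip_u8 are hypothetical and are not FFmpeg's; the layout (residuals stored contiguously, size values per row, destination addressed with a byte stride) is an assumption made for the example. The SSE2/AVX code discussed in the thread performs the same arithmetic on many pixels per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate sum to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical scalar reference (not FFmpeg's code): add a size x size
 * block of 16-bit residuals to 8-bit pixels in place.
 * Assumptions: res is contiguous with `size` values per row, and
 * dst rows are `stride` bytes apart.
 */
void residual_add_8(uint8_t *dst, const int16_t *res,
                    ptrdiff_t stride, int size)
{
    assert(dst && res && size > 0);

    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]);
        dst += stride; /* next pixel row */
        res += size;   /* next residual row */
    }
}
```

Calling residual_add_8(dst, res, stride, 16) or residual_add_8(dst, res, stride, 32) would correspond to the 16x16 and 32x32 block sizes named in the subject line.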