On 10/13/16 16:02, Alexandra Hájková wrote:
From: Pierre Edouard Lepere
Initially written by Pierre Edouard Lepere,
extended by James Almer.
Signed-off-by: Alexandra Hájková
Hi,
On Oct 14, 2016 17:28, "Luca Barbato" wrote:
>
> On 14/10/2016 23:25, Vittorio Giovara wrote:
> > From: Michael Niedermayer
> >
> > Signed-off-by: Vittorio Giovara
> > ---
> > This should fix ppc and sun fate tests.
> >
This work is sponsored by, and copyright, Google.
The implementation tries to have smart handling of cases
where no pixels need the full filtering for the 8/16 width
filters, skipping both calculation and writeback of the
unmodified pixels in those cases. The actual effect of this
is hard to test
From: Michael Niedermayer
This avoids potential issues with the high 32bits being random in x86-64 asm
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavcodec/huffyuvdsp.h | 4 ++--
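
To illustrate the concern in the message above: under the common x86-64 calling conventions, a 32-bit int argument occupies only the low half of a 64-bit register, and the upper half is not guaranteed to be cleared. Assembly that uses the full register as an index can then read a wrong value. The sketch below only simulates this in portable C. The helper names are hypothetical and are not taken from huffyuvdsp.h, and it assumes the fix is to pass a pointer-sized integer so the caller widens the value.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulates a 64-bit register whose low half holds a 32-bit int argument
 * and whose high half holds leftover bits the caller never cleared.
 * Hypothetical helper, for illustration only. */
static uint64_t reg_with_garbage(int32_t w, uint32_t garbage)
{
    return ((uint64_t)garbage << 32) | (uint32_t)w;
}

/* Simulates what ends up in the register when the parameter is declared
 * with a pointer-sized type (assumed here to be ptrdiff_t): the caller
 * sign-extends the value, so the whole register is meaningful. */
static uint64_t reg_widened(int32_t w)
{
    return (uint64_t)(int64_t)(ptrdiff_t)w;
}
```

With a width of 16, the simulated register in the first case only equals 16 after masking to the low 32 bits, while in the second case the full 64-bit value can be used directly.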
From: Christophe Gisquet
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
From: James Darnley
This is a fairly dumb copy of the assembly for 8-bit samples but it
works and produces identical output to the C version. The options have
been tested on an Athlon64 and a Core2Quad.
Athlon64:
1810385 decicycles in C, 32726 runs, 42 skips
1080744
From: James Darnley
These smaller samples do not need to be unpacked to double words
allowing the code to process more pixels every iteration (still 2 in MMX
but 6 in SSE2). It also avoids emulating the missing double word
instructions on older instruction sets.
Like
From: James Darnley
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavfilter/x86/vf_yadif.asm | 28 ++--
libavfilter/x86/yadif-10.asm | 26 +-
From: James Almer
And use the x86util ones instead, which are optimized for mmxext/sse2.
About ~1% increase in performance on pre SSSE3 processors.
Signed-off-by: James Almer
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
From: Christophe Gisquet
           C   MMX  SSE2
Cycles: 2972   587   302
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavcodec/huffyuvdsp.h | 2 +-
libavcodec/huffyuvdsp.c
From: James Darnley
The filter already checks that width (and height) are greater than 3.
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavfilter/x86/vf_yadif.asm | 2 --
libavfilter/x86/yadif-10.asm |
From: Christophe Gisquet
From 5010c to 4566 on lagarith YUY2.
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavcodec/x86/huffyuvdsp_init.c | 4 ++
libavcodec/x86/huffyuvdsp.asm| 98
From: Christophe Gisquet
When there are 2 functions that are <= SSE2, only one is needed for x86_64.
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavcodec/x86/huffyuvdsp_init.c | 10 ++
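
The reasoning in the message above rests on SSE2 being part of the x86_64 baseline: if both an MMX and an SSE2 version of a function exist, the SSE2 one always wins on x86_64, so the MMX one only needs to be considered on 32-bit builds. A minimal sketch of that selection logic follows. The flag values and the pick_impl helper are hypothetical and do not reproduce the code in huffyuvdsp_init.c.

```c
#include <string.h>

/* Hypothetical CPU feature flags, for illustration only. */
enum { CPU_MMX = 1, CPU_SSE2 = 2 };

/* Returns the name of the implementation that would be selected.
 * is_x86_64 stands in for a compile-time architecture check. */
static const char *pick_impl(int cpu_flags, int is_x86_64)
{
    const char *impl = "c";

    /* The MMX version is only worth considering on 32-bit x86,
     * because x86_64 CPUs always provide SSE2. */
    if (!is_x86_64 && (cpu_flags & CPU_MMX))
        impl = "mmx";

    /* SSE2 overrides MMX whenever it is available. */
    if (cpu_flags & CPU_SSE2)
        impl = "sse2";

    return impl;
}
```

On a 32-bit CPU with only MMX this picks the MMX version. On x86_64 the MMX branch is never taken, so that function can be left out of the build.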
Hi,
This series brings in the yadif high bit depth x86 asm from FFmpeg.
Since '[PATCH 11/12] x86util: add and use RSHIFT/LSHIFT macros' depends
on huffyuv asm changes and those looked self-contained I brought them
over as well. I tried to keep the changes to the original patches as
small as
From: Christophe Gisquet
Signed-off-by: Janne Grunau
---
tests/fate/filter-video.mak | 6 ++
tests/ref/fate/filter-yadif10 | 31 +++
tests/ref/fate/filter-yadif16 | 31 +++
3
From: Robert Krüger
Signed-off-by: Robert Krüger
Signed-off-by: Michael Niedermayer
Signed-off-by: Janne Grunau
---
libavfilter/x86/yadif-10.asm | 18 +-
libavfilter/x86/yadif-16.asm | 18
The include was changed correctly in 4abe3b049d987420eb891f74a35af2cebbf52144,
but then mistakenly changed back by c359d624d3efc3fd1d83210d78c4152bd329b765
(it's not just the NAL unit types which are used).
---
Building is currently broken with libva supporting H.265, this fixes it.
On 16/10/2016 20:10, Janne Grunau wrote:
> From: Robert Krüger
>
> Signed-off-by: Robert Krüger
> Signed-off-by: Michael Niedermayer
> Signed-off-by: Janne Grunau
> ---
> libavfilter/x86/yadif-10.asm | 18
On 16/10/2016 23:23, Martin Storsjö wrote:
> On Sun, 16 Oct 2016, Luca Barbato wrote:
>
>> On 16/10/2016 22:18, Martin Storsjö wrote:
>>>
>>> Now the comparison to libvpx is much more close; we're rarely slower
>>> at all, and even much faster in some cases.
>>
>> Probably you could update the
On 16/10/2016 20:10, Janne Grunau wrote:
> From: James Darnley
>
> Signed-off-by: Michael Niedermayer
> Signed-off-by: Janne Grunau
> ---
> libavfilter/x86/vf_yadif.asm | 28 ++--
>
On 16/10/2016 22:18, Martin Storsjö wrote:
>
> Now the comparison to libvpx is much more close; we're rarely slower
> at all, and even much faster in some cases.
Probably you could update the statement in the commit and push it then =)
lu
On 16/10/2016 20:10, Janne Grunau wrote:
> From: James Darnley
>
> The filter already checks that width (and height) are greater than 3.
>
> Signed-off-by: Michael Niedermayer
> Signed-off-by: Janne Grunau
> ---
>
On Sun, 16 Oct 2016, Luca Barbato wrote:
> On 16/10/2016 22:18, Martin Storsjö wrote:
>> Now the comparison to libvpx is much more close; we're rarely slower
>> at all, and even much faster in some cases.
> Probably you could update the statement in the commit and push it then =)
I've already
On 17/10/2016 01:03, Mark Thompson wrote:
> The include was changed correctly in 4abe3b049d987420eb891f74a35af2cebbf52144,
> but then mistakenly changed back by c359d624d3efc3fd1d83210d78c4152bd329b765
> (it's not just the NAL unit types which are used).
> ---
> Building is currently broken with