Hi, Andriy

----- Original Message -----
> From: "Andriy Gelman" <andriy.gel...@gmail.com>
> To: "FFmpeg development discussions and patches" <ffmpeg-devel@ffmpeg.org>
> Cc: xuju...@sjtu.edu.cn
> Sent: Monday, December 23, 2019 12:50:48 AM
> Subject: Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/vf_convolution: Add x86 SIMD optimizations for filter_row()

> Xu,
> 
> On Sun, 22. Dec 16:37, xuju...@sjtu.edu.cn wrote:
>> From: Xu Jun <xuju...@sjtu.edu.cn>
>> 
>> Read 16 elements from memory, shuffle them, and compute 4 rows in parallel,
>> then shuffle and write the 16 results to memory in parallel.
>> Performance improves about 15% compared to v1.
>> 
>> Tested using this command:
>> ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45:1/45:1:2:3:4:row:row:row:row" -an -vframes 5000 -f null /dev/null -benchmark
>> 
>> after patch:
>> frame= 4317 fps=622 q=-0.0 Lsize=N/A time=00:02:52.68 bitrate=N/A speed=24.9x
>> video:2260kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
>> bench: utime=20.539s stime=1.834s rtime=6.943s
>> 
>> before patch (C version):
>> frame= 4317 fps=306 q=-0.0 Lsize=N/A time=00:02:52.68 bitrate=N/A speed=12.2x
>> video:2260kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
>> bench: utime=60.591s stime=1.787s rtime=14.100s
>> 
>> Signed-off-by: Xu Jun <xuju...@sjtu.edu.cn>
>> ---
>>  libavfilter/x86/vf_convolution.asm    | 131 ++++++++++++++++++++++++++
>>  libavfilter/x86/vf_convolution_init.c |   9 ++
>>  2 files changed, 140 insertions(+)
>>  mode change 100644 => 100755 libavfilter/x86/vf_convolution.asm
>> 
>> diff --git a/libavfilter/x86/vf_convolution.asm b/libavfilter/x86/vf_convolution.asm
>> old mode 100644
>> new mode 100755
>> index 754d4d1064..2a09374b00
>> --- a/libavfilter/x86/vf_convolution.asm
>> +++ b/libavfilter/x86/vf_convolution.asm
>> @@ -154,3 +154,134 @@ cglobal filter_3x3, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2, c
>>  INIT_XMM sse4
>>  FILTER_3X3
>>  %endif
>> +
> 
> Patch 2-3 are failing to build:
> https://unofficial.patchwork-ffmpeg.org/project/FFmpeg/list/?series=26
> 
> --
> Andriy

Sorry, I did not build each patch on its own. There seem to be dependency
problems between the patches in the series.
I'll fix them in v3.

Xu Jun  
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
