This is only a toy; it depends on the compiler.
On my PC, it helps my old compiler version generate movaps rather than movups.
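For illustration, here is a minimal sketch of how such a hint might be used. The macro is the one quoted from the patch; the function add_aligned() and its alignment conditions are hypothetical and not part of the patch. Whether a given compiler actually turns the hint into aligned loads (movaps) varies by compiler and version; GCC and Clang also provide __builtin_assume_aligned() for stating alignment directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Macro as quoted from the patch (GCC/Clang only). */
#define __assume(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

/*
 * Hypothetical helper, not from the patch: adds two float arrays.
 * The caller must guarantee 16-byte alignment of all three pointers
 * and that n is a multiple of 4; violating a hint is undefined behavior.
 */
void add_aligned(float *dst, const float *a, const float *b, size_t n)
{
    __assume(((uintptr_t)dst & 15) == 0);
    __assume(((uintptr_t)a   & 15) == 0);
    __assume(((uintptr_t)b   & 15) == 0);
    __assume(n % 4 == 0);

    /* A vectorizing compiler may use the hints above when choosing
     * between aligned and unaligned loads/stores for this loop. */
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Comparing the output of the compiler's -S option with and without the hints shows whether they make a difference on a given toolchain.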
At 2019-12-02 17:21:58, "Carl Eugen Hoyos" wrote:
>Am Mo., 2. Dez. 2019 um 08:33 Uhr schrieb chen :
>
>> +#define __assume(cond) do { if (!(cond))
Am Mo., 2. Dez. 2019 um 08:33 Uhr schrieb chen :
> +#define __assume(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
We currently don't do that.
If you have a testcase where it makes a big difference,
adding it could be discussed but has to be checked in
configure and added
Am Mo., 2. Dez. 2019 um 03:42 Uhr schrieb 徐鋆 :
> I'm sorry not to reply in time.
Definitely in time!
> The performance of this C code is about 10% better than the existing C code.
Please add this to the commit message.
Carl Eugen
___
ffmpeg-devel
I have a small suggestion about filter_column16(..) [the function].
First, its name is easily confused with filter16_column(..).
Second, the function's algorithm works in the row direction, which means fewer
address-calculation operations but worse cache performance; their cost may be
more than the calculation
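To make the access pattern under discussion concrete, here is a rough sketch of a column (vertical) filter that walks the source row by row and produces 16 adjacent output pixels at once. It is not the code from the patch: the name column_filter_16(), its parameters, and the 8-bit clipping are assumptions made for illustration only.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not the patch's function: applies a vertical
 * filter with `taps` coefficients to 16 adjacent columns.
 * src points at the first of `taps` consecutive rows, `stride` bytes apart.
 */
void column_filter_16(uint8_t *dst, const uint8_t *src, int stride,
                      const int *coef, int taps)
{
    int sum[16] = {0};

    /* Outer loop over rows: one row address is computed per tap,
     * and the inner loop reads 16 contiguous bytes of that row. */
    for (int t = 0; t < taps; t++) {
        const uint8_t *row = src + t * stride;
        for (int x = 0; x < 16; x++)
            sum[x] += row[x] * coef[t];
    }

    /* Clip each accumulated value to the 8-bit range. */
    for (int x = 0; x < 16; x++)
        dst[x] = sum[x] < 0 ? 0 : sum[x] > 255 ? 255 : (uint8_t)sum[x];
}
```

How this compares with a per-column loop depends on the stride, the number of taps, and the cache behavior of the target machine, so it would need to be measured rather than assumed.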
> -Original Message-
> From: ffmpeg-devel On Behalf Of
> xuju...@sjtu.edu.cn
> Sent: Wednesday, November 27, 2019 10:56 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: xuju...@sjtu.edu.cn
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column
> operation for filter_column() to
Hi, Steven
- Original Message -
From: "Steven Liu"
To: "FFmpeg development discussions and patches"
Cc: "Steven Liu"
Sent: Monday, December 2, 2019 10:44:48 AM
Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column operation
for filter_column() to prepare for x86 SIMD.
> On December 2, 2019, at 10:42, 徐鋆 wrote:
>
> I'm sorry not to reply in time.
>
> The performance of this C code is about 10% better than the existing C code.
>
> It will have a bigger improvement after X86 SIMD optimizations.
1. How did you test it?
2. Don't top-post:
I'm sorry not to reply in time.
The performance of this C code is about 10% better than the existing C code.
It will have a bigger improvement after X86 SIMD optimizations.
Xu Jun
- Original Message -
From: "Carl Eugen Hoyos"
To: "FFmpeg development discussions and patches"
Sent: Thursday, November 2019
Am Mi., 27. Nov. 2019 um 15:56 Uhr schrieb :
> From: Xu Jun
>
> In order to add x86 SIMD for filter_column(), I write a C function which
> processes 16 columns at a time.
How does this perform compared to the existing C code?
Carl Eugen
From: Xu Jun
In order to add x86 SIMD for filter_column(), I write a C function which
processes 16 columns at a time.
Signed-off-by: Xu Jun
---
libavfilter/vf_convolution.c | 56 +++
libavfilter/x86/vf_convolution_init.c | 23 +++
2 files changed, 79