Re: [FFmpeg-devel] [PATCH V1 1/3] lavu: Add alpha blending API based on row.

2018-09-25 Thread myp...@gmail.com
On Wed, Sep 26, 2018 at 6:58 AM Marton Balint  wrote:
>
>
>
> On Tue, 25 Sep 2018, Jun Zhao wrote:
>
> > Add alpha blending API based on row, support global alpha blending/
> > per-pixel blending, and add SSSE3/AVX2 optimizations of the functions.
>
> You might want to take a look at
> libavfilter/vf_framerate.c and libavfilter/x86/vf_framerate.asm as well,
> they do something similar. Maybe you should factorize that instead.
>
>
Yep, this is a good suggestion. I think we can factor this part out and
supply a public 8-bit/16-bit blend API with SSSE3/AVX2 optimization,
which we can then use in vf_framerate, vf_blend (blend_normal_8bit/16bit)
and vf_minterpolate (blend mode).
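
As a rough illustration of the kind of shared 8-bit/16-bit row helper
meant here, a minimal C sketch follows. The names blend_row_8 and
blend_row_16 are hypothetical and are not part of libavutil or of this
patch. The 8-bit rounding follows the C reference in the patch, and the
16-bit variant is an assumed straightforward extension of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an FFmpeg API.
 * Blend one row of 8-bit samples with a global alpha in [0, 255]:
 * dst = (src0 * alpha + src1 * (255 - alpha) + 255) >> 8 */
static void blend_row_8(const uint8_t *src0, const uint8_t *src1,
                        uint8_t *dst, size_t width, unsigned alpha)
{
    for (size_t x = 0; x < width; x++)
        dst[x] = (src0[x] * alpha + src1[x] * (255 - alpha) + 255) >> 8;
}

/* Hypothetical 16-bit counterpart (assumed extension), alpha in [0, 65535].
 * The largest intermediate value is 65535 * 65536, which fits in 32 bits. */
static void blend_row_16(const uint16_t *src0, const uint16_t *src1,
                         uint16_t *dst, size_t width, unsigned alpha)
{
    for (size_t x = 0; x < width; x++)
        dst[x] = ((uint32_t)src0[x] * alpha +
                  (uint32_t)src1[x] * (65535 - alpha) + 65535) >> 16;
}
```

With this rounding, a fully opaque alpha returns src0 unchanged and a
zero alpha returns src1 unchanged, so a SIMD version can be checked
against the C one sample by sample.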
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH V1 1/3] lavu: Add alpha blending API based on row.

2018-09-25 Thread James Almer
On 9/25/2018 10:45 PM, myp...@gmail.com wrote:
> On Wed, Sep 26, 2018 at 3:55 AM Rostislav Pehlivanov 
> wrote:
>>
>> On 25 September 2018 at 16:27, Jun Zhao  wrote:
>>
>>> Add alpha blending API based on row, support global alpha blending/
>>> per-pixel blending, and add SSSE3/AVX2 optimizations of the functions.
>>>
> 
> 
>> We don't use inline asm on x86 and we don't use global contexts. Look at
>> how float_dsp is done.
> 
> I guess you mean, more precisely, "prefer the NASM assembler over inline
> asm on x86". :)
> In fact, I know of some x86 inline asm in FFmpeg, e.g. libavcodec/x86/h264_cabac.
> (grep "__asm__ volatile" finds more x86 inline asm.) Do we also need to
> update the inline asm on x86 rule in
> https://github.com/FFmpeg/FFmpeg/blob/master/doc/optimization.txt?

Yes, we still have some inline asm, either because nobody has gotten
around to porting it to NASM syntax after the project moved to it, or
because, as with CABAC and some single-instruction functions in
libavutil, it makes sense for it to be inline since the call overhead
would kill performance.

That document could use some polishing, but in any case, as stated in
the "Inline asm vs. external asm" section, we have for several years
required new code that calls external functions to be written in NASM
syntax, as is the case with this patchset.
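
To make the float_dsp pointer from earlier in the thread concrete, here
is a minimal C sketch of that pattern: the caller allocates a context of
function pointers once, and an arch-specific init step may override the
C defaults. libavutil's float_dsp uses AVFloatDSPContext and
avpriv_float_dsp_alloc() for this. Every name below (BlendDSPContext,
blend_dsp_alloc, blend_dsp_init_x86) is hypothetical, and the x86 init
is only a stub standing in for code that would check CPU flags and
install external NASM functions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical context, modelled loosely on the float_dsp approach. */
typedef struct BlendDSPContext {
    void (*global_blend_row)(const uint8_t *src0, const uint8_t *src1,
                             uint8_t alpha, uint8_t *dst, int width);
} BlendDSPContext;

/* Plain C fallback, same arithmetic as the patch's reference code. */
static void global_blend_row_c(const uint8_t *src0, const uint8_t *src1,
                               uint8_t alpha, uint8_t *dst, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (src0[x] * alpha + src1[x] * (255 - alpha) + 255) >> 8;
}

/* Stub: a real version would test CPU flags and point the context at
 * external asm functions. This sketch installs nothing. */
static void blend_dsp_init_x86(BlendDSPContext *ctx)
{
    (void)ctx;
}

/* Selection happens once at allocation time, not on every row, and no
 * global state is involved. The caller releases the context with free(). */
static BlendDSPContext *blend_dsp_alloc(void)
{
    BlendDSPContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->global_blend_row = global_blend_row_c;
    blend_dsp_init_x86(ctx);
    return ctx;
}
```

A filter would then call ctx->global_blend_row(...) per row, instead of
re-running the init logic inside every call.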

> 
> Thanks.
> 



Re: [FFmpeg-devel] [PATCH V1 1/3] lavu: Add alpha blending API based on row.

2018-09-25 Thread myp...@gmail.com
On Wed, Sep 26, 2018 at 3:55 AM Rostislav Pehlivanov 
wrote:
>
> On 25 September 2018 at 16:27, Jun Zhao  wrote:
>
> > Add alpha blending API based on row, support global alpha blending/
> > per-pixel blending, and add SSSE3/AVX2 optimizations of the functions.
> >


> We don't use inline asm on x86 and we don't use global contexts. Look at
> how float_dsp is done.

I guess you mean, more precisely, "prefer the NASM assembler over inline
asm on x86". :)
In fact, I know of some x86 inline asm in FFmpeg, e.g. libavcodec/x86/h264_cabac.
(grep "__asm__ volatile" finds more x86 inline asm.) Do we also need to
update the inline asm on x86 rule in
https://github.com/FFmpeg/FFmpeg/blob/master/doc/optimization.txt?

Thanks.


Re: [FFmpeg-devel] [PATCH V1 1/3] lavu: Add alpha blending API based on row.

2018-09-25 Thread Marton Balint



On Tue, 25 Sep 2018, Jun Zhao wrote:


> Add alpha blending API based on row, support global alpha blending/
> per-pixel blending, and add SSSE3/AVX2 optimizations of the functions.


You might want to take a look at 
libavfilter/vf_framerate.c and libavfilter/x86/vf_framerate.asm as well, 
they do something similar. Maybe you should factorize that instead.


Regards,
Marton


Re: [FFmpeg-devel] [PATCH V1 1/3] lavu: Add alpha blending API based on row.

2018-09-25 Thread Rostislav Pehlivanov
On 25 September 2018 at 16:27, Jun Zhao  wrote:

> Add alpha blending API based on row, support global alpha blending/
> per-pixel blending, and add SSSE3/AVX2 optimizations of the functions.
>
> Signed-off-by: Jun Zhao 
> ---
>  libavutil/Makefile |2 +
>  libavutil/blend.c  |  101 
>  libavutil/blend.h  |   47 ++
>  libavutil/x86/Makefile |3 +-
>  libavutil/x86/blend.h  |   32 
>  libavutil/x86/blend_init.c |  369 ++++
>  6 files changed, 553 insertions(+), 1 deletions(-)
>  create mode 100644 libavutil/blend.c
>  create mode 100644 libavutil/blend.h
>  create mode 100644 libavutil/x86/blend.h
>  create mode 100644 libavutil/x86/blend_init.c
>
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index 9ed24cf..f1c06e4 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -10,6 +10,7 @@ HEADERS = adler32.h \
>avstring.h\
>avutil.h  \
>base64.h  \
> +  blend.h   \
>blowfish.h\
>bprint.h  \
>bswap.h   \
> @@ -95,6 +96,7 @@ OBJS = adler32.o \
> audio_fifo.o \
> avstring.o   \
> base64.o \
> +   blend.o  \
> blowfish.o   \
> bprint.o \
> buffer.o \
> diff --git a/libavutil/blend.c b/libavutil/blend.c
> new file mode 100644
> index 000..e28efa0
> --- /dev/null
> +++ b/libavutil/blend.c
> @@ -0,0 +1,101 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> + */
> +
> +#include "libavutil/attributes.h"
> +#include "libavutil/cpu.h"
> +#include "libavutil/mem.h"
> +#include "libavutil/x86/asm.h"
> +#include "libavutil/blend.h"
> +
> +#include "libavutil/x86/blend.h"
> +
> +static void ff_global_blend_row_c(const uint8_t *src0,
> +                                  const uint8_t *src1,
> +                                  const uint8_t *alpha, /* XXX: only use alpha[0] */
> +                                  uint8_t *dst,
> +                                  int width)
> +{
> +    int x;
> +    for (x = 0; x < width - 1; x += 2) {
> +        dst[0] = (src0[0] * alpha[0] + src1[0] * (255 - alpha[0]) + 255) >> 8;
> +        dst[1] = (src0[1] * alpha[0] + src1[1] * (255 - alpha[0]) + 255) >> 8;
> +        src0 += 2;
> +        src1 += 2;
> +        dst  += 2;
> +    }
> +    if (width & 1) {
> +        dst[0] = (src0[0] * alpha[0] + src1[0] * (255 - alpha[0]) + 255) >> 8;
> +    }
> +}
> +
> +void av_global_blend_row(const uint8_t *src0,
> +                         const uint8_t *src1,
> +                         const uint8_t *alpha,
> +                         uint8_t *dst,
> +                         int width)
> +{
> +    blend_row blend_row_fn = NULL;
> +
> +#if ARCH_X86
> +    blend_row_fn = ff_blend_row_init_x86(1);
> +#endif
> +
> +    if (!blend_row_fn)
> +        blend_row_fn = ff_global_blend_row_c;
> +
> +    blend_row_fn(src0, src1, alpha, dst, width);
> +}
> +
> +static void ff_per_pixel_blend_row_c(const uint8_t *src0,
> +                                     const uint8_t *src1,
> +                                     const uint8_t *alpha,
> +                                     uint8_t *dst,
> +                                     int width)
> +{
> +    int x;
> +    for (x = 0; x < width - 1; x += 2) {
> +        dst[0] = (src0[0] * alpha[0] + src1[0] * (255 - alpha[0]) + 255) >> 8;
> +        dst[1] = (src0[1] * alpha[0] + src1[1] * (255 - alpha[0]) +