Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-01 Thread Fu, Ting


> -----Original Message-----
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, November 29, 2019 04:51 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> On Thu, Nov 28, 2019 at 02:07:08PM +0800, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/yuv2rgb.c  |   5 +
> >  libswscale/x86/yuv2rgb_template.c |  58 ++-
> >  libswscale/x86/yuv_2_rgb.asm  | 163 +++---
> >  3 files changed, 208 insertions(+), 18 deletions(-)
> 
> breaks build on x86-32
> make
> X86ASM  libswscale/x86/yuv_2_rgb.o
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: label `BROADCAST' inconsistently
> redefined
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: note: label `BROADCAST' originally
> defined here
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined
> here
> make: *** [libswscale/x86/yuv_2_rgb.o] Error
> 

Hi Michael,

This error occurs because the BROADCAST macro is defined only under ARCH_X86_64.
In PATCH V2 I have replaced it with VBROADCASTSD (defined in x86util.asm).
In addition, the md5 test passes on linux32/64 and windows64.
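
For readers unfamiliar with this failure mode, here is a minimal C sketch of the same pattern. It is an analogy only, not the NASM change made in PATCH V2. A helper macro that is defined in only one branch of an architecture conditional fails wherever the other branch is built; defining it in every branch avoids that. In the NASM case, the assembler appears to parse the undefined macro name as a label, which would explain the "instruction expected" and "label `BROADCAST' inconsistently redefined" errors. The names BROADCAST8, broadcast8_generic and broadcast_byte are hypothetical, and ARCH_X86_64 is derived from compiler macros here rather than taken from FFmpeg's configure output.

```c
#include <stdint.h>

/* Assumption: stand-in for FFmpeg's configure-generated ARCH_X86_64,
 * derived from compiler-provided macros for this standalone sketch. */
#ifndef ARCH_X86_64
#  if defined(__x86_64__) || defined(_M_X64)
#    define ARCH_X86_64 1
#  else
#    define ARCH_X86_64 0
#  endif
#endif

/* Fragile pattern (what breaks the 32-bit build): the helper exists
 * in one branch only, yet is used unconditionally further down.
 *
 *   #if ARCH_X86_64
 *   #define BROADCAST8(x) ...
 *   #endif
 */

/* Robust pattern: every branch defines the helper (hypothetical names). */
#if ARCH_X86_64
/* Replicate one byte into all eight bytes of a 64-bit word. */
#define BROADCAST8(x) ((uint64_t)(uint8_t)(x) * 0x0101010101010101ULL)
#else
/* Portable fallback that builds the same value byte by byte. */
static uint64_t broadcast8_generic(uint8_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        r = (r << 8) | x;
    return r;
}
#define BROADCAST8(x) broadcast8_generic((uint8_t)(x))
#endif

/* The use site compiles on either architecture. */
uint64_t broadcast_byte(uint8_t x)
{
    return BROADCAST8(x);
}
```

Calling broadcast_byte(0x07) yields 0x0707070707070707, the same constant as pb_07 in yuv2rgb.c.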

Thank you for the review.
Ting Fu

> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> I do not agree with what you have to say, but I'll defend to the death your 
> right
> to say it. -- Voltaire
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-11-28 Thread Michael Niedermayer
On Thu, Nov 28, 2019 at 02:07:08PM +0800, Ting Fu wrote:
> Signed-off-by: Ting Fu 
> ---
>  libswscale/x86/yuv2rgb.c  |   5 +
>  libswscale/x86/yuv2rgb_template.c |  58 ++-
>  libswscale/x86/yuv_2_rgb.asm  | 163 +++---
>  3 files changed, 208 insertions(+), 18 deletions(-)

breaks build on x86-32
make
X86ASM  libswscale/x86/yuv_2_rgb.o
src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:401: error: label `BROADCAST' inconsistently redefined
src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:400: note: label `BROADCAST' originally defined here
src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined here
src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined here
make: *** [libswscale/x86/yuv_2_rgb.o] Error 


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire



Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-11-27 Thread Fu, Ting


> -----Original Message-----
> From: ffmpeg-devel  On Behalf Of Carl
> Eugen Hoyos
> Sent: Thursday, November 28, 2019 02:29 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> 
> 
> > On 28.11.2019 at 07:07, Ting Fu wrote:
> >
> > +#if HAVE_SSSE3
> > +#define COMPILE_TEMPLATE_SSSE3 1
> > +#endif
> 
> Please add a line about performance to the commit message.
> 
> Carl Eugen

Hi Carl,

Sorry for the missing performance info. I tested the patch with a raw YUV video, using this command:
./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i input.yuv -vcodec rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
The results on my local machine are as follows:
output fmt RGB24:
mmx: 337fps   ssse3: 634fps
output fmt RGB32:
mmx: 375fps   ssse3: 653fps
output fmt RGB555:
mmx: 427fps   ssse3: 917fps
I will add this information to PATCH V2.
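
For a commit message, the figures above are easier to compare as ratios. The sketch below is an illustration only; the struct and function names are hypothetical and not part of the patch. It divides each ssse3 frame rate by the matching mmx frame rate, which gives roughly 1.88x for RGB24, 1.74x for RGB32 and 2.15x for RGB555.

```c
#include <stdio.h>

/* One benchmark row: output format plus the two measured frame rates. */
struct bench {
    const char *fmt;
    double mmx_fps;
    double ssse3_fps;
};

/* Frame rates as quoted in this message. */
static const struct bench results[] = {
    { "RGB24",  337.0, 634.0 },
    { "RGB32",  375.0, 653.0 },
    { "RGB555", 427.0, 917.0 },
};

/* Speedup of the new code path relative to the baseline. */
double speedup(double baseline_fps, double new_fps)
{
    return new_fps / baseline_fps;
}

/* Print one line per output format with the ssse3/mmx ratio. */
void print_speedups(void)
{
    for (size_t i = 0; i < sizeof(results) / sizeof(results[0]); i++)
        printf("%-7s mmx %.0f fps, ssse3 %.0f fps, speedup %.2fx\n",
               results[i].fmt, results[i].mmx_fps, results[i].ssse3_fps,
               speedup(results[i].mmx_fps, results[i].ssse3_fps));
}
```

These ratios describe a single local machine, so they are only indicative.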

Thank you
Ting Fu


Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-11-27 Thread Carl Eugen Hoyos


> On 28.11.2019 at 07:07, Ting Fu wrote:
> 
> +#if HAVE_SSSE3
> +#define COMPILE_TEMPLATE_SSSE3 1
> +#endif

Please add a line about performance to the commit message.

Carl Eugen

[FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-11-27 Thread Ting Fu
Signed-off-by: Ting Fu 
---
 libswscale/x86/yuv2rgb.c  |   5 +
 libswscale/x86/yuv2rgb_template.c |  58 ++-
 libswscale/x86/yuv_2_rgb.asm  | 163 +++---
 3 files changed, 208 insertions(+), 18 deletions(-)

diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c
index 70412a3914..d983934762 100644
--- a/libswscale/x86/yuv2rgb.c
+++ b/libswscale/x86/yuv2rgb.c
@@ -61,6 +61,11 @@ DECLARE_ASM_CONST(8, uint64_t, pb_07) = 0x0707070707070707ULL;
 #define COMPILE_TEMPLATE_MMXEXT 1
 #endif /* HAVE_MMXEXT */
 
+//SSSE3 versions
+#if HAVE_SSSE3
+#define COMPILE_TEMPLATE_SSSE3 1
+#endif
+
 #include "yuv2rgb_template.c"
 
 av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
diff --git a/libswscale/x86/yuv2rgb_template.c b/libswscale/x86/yuv2rgb_template.c
index efe6356f30..fe586047f0 100644
--- a/libswscale/x86/yuv2rgb_template.c
+++ b/libswscale/x86/yuv2rgb_template.c
@@ -40,6 +40,30 @@
 const uint8_t *pv = src[2] +   (y >> vshift) * srcStride[2]; \
 x86_reg index = -h_size / 2; \
 
+extern void ff_yuv_420_rgb24_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuv_420_bgr24_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuv_420_rgb15_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuv_420_rgb16_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuv_420_rgb32_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuv_420_bgr32_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                   const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                   const uint8_t *py_2index);
+extern void ff_yuva_420_rgb32_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                    const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                    const uint8_t *py_2index, const uint8_t *pa_2index);
+extern void ff_yuva_420_bgr32_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
+                                    const uint8_t *pv_index, const uint8_t *pointer_c_dither,
+                                    const uint8_t *py_2index, const uint8_t *pa_2index);
 extern void ff_yuv_420_rgb24_mmxext(x86_reg index, uint8_t *image, const uint8_t *pu_index,
                                     const uint8_t *pv_index, const uint8_t *pointer_c_dither,
                                     const uint8_t *py_2index);
@@ -84,7 +108,12 @@ static inline int yuv420_rgb15(SwsContext *c, const uint8_t *src[],
 c->greenDither = ff_dither8[y   & 1];
 c->redDither   = ff_dither8[(y + 1) & 1];
 #endif
+
+#if COMPILE_TEMPLATE_SSSE3
+ff_yuv_420_rgb15_ssse3(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
+#else
 ff_yuv_420_rgb15_mmx(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
+#endif
 }
 return srcSliceH;
 }
@@ -102,7 +131,12 @@ static inline int yuv420_rgb16(SwsContext *c, const uint8_t *src[],
 c->greenDither = ff_dither4[y   & 1];
 c->redDither   = ff_dither8[(y + 1) & 1];
 #endif
+
+#if COMPILE_TEMPLATE_SSSE3
+ff_yuv_420_rgb16_ssse3(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
+#else
 ff_yuv_420_rgb16_mmx(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
+#endif
 }
 return srcSliceH;
 }
@@ -115,7 +149,9 @@ static inline int yuv420_rgb24(SwsContext *c, const uint8_t *src[],
 int y, h_size, vshift;
 YUV2RGB_LOOP(3)
 
-#if COMPILE_TEMPLATE_MMXEXT
+#if COMPILE_TEMPLATE_SSSE3
+ff_yuv_420_rgb24_ssse3(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
+#elif COMPILE_TEMPLATE_MMXEXT
 ff_yuv_420_rgb24_mmxext(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
 #else
 ff_yuv_420_rgb24_mmx(index, image, pu - index, pv - index, &(c->redDither), py - 2 * index);
@@ -132,7 +168,9 @@ static inline int yuv420_bgr24(SwsContext *c, const uint8_t *src[],
 int y, h_size,