from:"Fu, Ting"

Re: [FFmpeg-devel] [PATCH V6 1/3] lavfi/dnn: Mark native backend as unsupported

2023-04-27 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Thursday, April 27, 2023 11:24 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V6 1/3] lavfi/dnn: Mark native backend
> as unsupported
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> Ting
> > Fu
> > Sent: Monday, March 6, 2023 9:56 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V6 1/3] lavfi/dnn: Mark native backend
> > as unsupported
> >
> > Native is a deprecated value for the backend_type option. Modify related
> > error
> 
> Native backend will be removed, and so change the interface first.
> 
> > message.
> >
Hi Yejun,
Thank you for the review. I have modified the commit message in PATCH V7.
[...]
> > --
> > 2.17.1
> >
> > ___
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V6 1/3] lavfi/dnn: Mark native backend as unsupported

2023-03-06 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting
> Fu
> Sent: Monday, March 6, 2023 09:56 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V6 1/3] lavfi/dnn: Mark native backend as
> unsupported
> 
> Native is a deprecated value for the backend_type option. Modify related error
> message.
> 
> Signed-off-by: Ting Fu 
> ---
>  libavfilter/dnn/dnn_interface.c | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
[...]
Compared with PATCH V5, this version only restores some content that was
incorrectly deleted in doc/filters.texi and rebases the commit on the latest code.
> --
> 2.17.1
> 

Re: [FFmpeg-devel] [PATCH V4 1/3] lavfi/dnn: Mark native backend as unsupported

2023-01-15 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting Fu
> Sent: Friday, January 6, 2023 05:19 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V4 1/3] lavfi/dnn: Mark native backend as
> unsupported
> 
> Native is a deprecated value for the backend_type option. Modify related error
> message.
> 
> Signed-off-by: Ting Fu 
> ---
>  libavfilter/dnn/dnn_interface.c | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/libavfilter/dnn/dnn_interface.c b/libavfilter/dnn/dnn_interface.c
> index 554a36b0dc..5b1695a1dd 100644
> --- a/libavfilter/dnn/dnn_interface.c
> +++ b/libavfilter/dnn/dnn_interface.c
> @@ -24,7 +24,6 @@
Kindly ping.
[...]
> --
> 2.17.1
> 

Re: [FFmpeg-devel] [PATCH V3 1/3] lavfi/dnn: Mark native backend as deprecated

2023-01-06 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting
> Fu
> Sent: Friday, January 6, 2023 05:02 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V3 1/3] lavfi/dnn: Mark native backend as
> deprecated
> 
> Mark native as deprecated for the backend_type option. Modify related error
> message.
> 
> Signed-off-by: Ting Fu 
> ---
>  libavfilter/dnn/dnn_interface.c | 12 
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
Sorry for the incorrect patch; please ignore V3 and check V4.
[...]
> 

Re: [FFmpeg-devel] [PATCH V2 1/2] lavfi/dnn: Modify error message for incorrect backend_type

2023-01-06 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Thursday, January 5, 2023 09:07 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/2] lavfi/dnn: Modify error message
> for incorrect backend_type
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> Ting
> > Fu
> > Sent: Monday, January 2, 2023 11:50 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V2 1/2] lavfi/dnn: Modify error message
> > for incorrect backend_type
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/dnn/dnn_interface.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/libavfilter/dnn/dnn_interface.c
> > b/libavfilter/dnn/dnn_interface.c index 554a36b0dc..fa484c0905 100644
> > --- a/libavfilter/dnn/dnn_interface.c
> > +++ b/libavfilter/dnn/dnn_interface.c
> > @@ -71,7 +71,7 @@ DNNModule *ff_get_dnn_module(DNNBackendType
> > backend_type)
> >  #endif
> >  break;
> >  default:
> > -av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or
> > tensorflow\n");
> > +av_log(NULL, AV_LOG_ERROR, "Module backend_type is not
> > + supported or enabled.\n");
> 
> We need to remove "case DNN_NATIVE:" in this patch, and so native
> backend will go here 'default:'.

Hi Yejun,

Updated it in PATCH V3.
> 
> > >  av_freep(&dnn_module);
> >  return NULL;
> >  }
> 
> Please also update doc/filters.texi in this commit, thanks.

Sure, I deleted the native part in doc/filters.texi and its related files in
tools/python/, and split the change into three commits. I hope this makes sense.

Thank you
Ting Fu
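
The review point above (drop the native case so that a native request falls through to 'default:') can be illustrated with a minimal sketch. The enum, struct, and function names here are simplified stand-ins invented for illustration, not the real FFmpeg DNN types; only the error string is taken from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the real backend enum and module. */
typedef enum { BACKEND_NATIVE, BACKEND_TF, BACKEND_OV } BackendType;

typedef struct BackendModule {
    const char *name;
} BackendModule;

static BackendModule tf_module = { "tensorflow" };
static BackendModule ov_module = { "openvino" };

/* Returns the module for a backend, or NULL if it is unsupported or not
 * enabled. BACKEND_NATIVE has no case of its own, so it reaches 'default:'
 * and is reported with the generic error message. */
static BackendModule *get_backend_module(BackendType type)
{
    switch (type) {
    case BACKEND_TF:
        return &tf_module;
    case BACKEND_OV:
        return &ov_module;
    default:
        fprintf(stderr, "Module backend_type is not supported or enabled.\n");
        return NULL;
    }
}
```

With this shape, the enum value can stay in the interface for a while, and callers that still pass it get a clear error instead of a working backend.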

Re: [FFmpeg-devel] [PATCH 2/2] lavfi/dnn: Remove DNN native backend

2023-01-03 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Marton Balint
> Sent: Sunday, January 1, 2023 06:20 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] lavfi/dnn: Remove DNN native
> backend
> 
> 
> 
> On Fri, 30 Dec 2022, Ting Fu wrote:
> 
> > According to discussion in
> > https://etherpad.mit.edu/p/FF_dev_meeting_20221202.
> > The DNN native backend should be removed at first step.
> > All the DNN native backend related code is deleted.
> 
> You should explain why it is being removed. The cited URL is not giving
> any explanations.
> 
> Thanks,
> Marton
> 
Hi Marton,

I hope the email here can explain:
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-December/304534.html
As I noticed, the native backend supports only a few models, even after many
layers were implemented in the FFmpeg DNN module. Naturally, supporting other
models would take much more work.
What's more, because of its mediocre, unsatisfactory performance, almost all
users choose other well-developed DNN frameworks for inference.

I believe those are the main reasons to remove native from the DNN module.

Thank you
Ting Fu
> 
> >
> > Signed-off-by: Ting Fu 
> > ---
> > libavfilter/dnn/Makefile  |  10 -
> > libavfilter/dnn/dnn_backend_native.c  | 561 --
> > libavfilter/dnn/dnn_backend_native.h  | 149 -
> > .../dnn/dnn_backend_native_layer_avgpool.c| 147 -
> > .../dnn/dnn_backend_native_layer_avgpool.h|  69 ---
> > .../dnn/dnn_backend_native_layer_conv2d.c | 265 -
> > .../dnn/dnn_backend_native_layer_conv2d.h |  68 ---
> > .../dnn/dnn_backend_native_layer_dense.c  | 151 -
> > .../dnn/dnn_backend_native_layer_dense.h  |  65 --
> > .../dnn_backend_native_layer_depth2space.c| 102 
> > .../dnn_backend_native_layer_depth2space.h|  72 ---
> > .../dnn/dnn_backend_native_layer_mathbinary.c | 193 --
> > .../dnn/dnn_backend_native_layer_mathbinary.h |  54 --
> > .../dnn/dnn_backend_native_layer_mathunary.c  | 156 -
> > .../dnn/dnn_backend_native_layer_mathunary.h  |  92 ---
> > .../dnn/dnn_backend_native_layer_maximum.c|  83 ---
> > .../dnn/dnn_backend_native_layer_maximum.h|  44 --
> > .../dnn/dnn_backend_native_layer_pad.c| 268 -
> > .../dnn/dnn_backend_native_layer_pad.h|  43 --
> > libavfilter/dnn/dnn_backend_native_layers.c   |  42 --
> > libavfilter/dnn/dnn_backend_native_layers.h   |  38 --
> > libavfilter/dnn/dnn_backend_tf.c  | 368 +---
> > libavfilter/dnn/dnn_interface.c   |  10 +-
> > libavfilter/tests/dnn-layer-avgpool.c | 197 --
> > libavfilter/tests/dnn-layer-conv2d.c  | 248 
> > libavfilter/tests/dnn-layer-dense.c   | 131 
> > libavfilter/tests/dnn-layer-depth2space.c | 102 
> > libavfilter/tests/dnn-layer-mathbinary.c  | 214 ---
> > libavfilter/tests/dnn-layer-mathunary.c   | 148 -
> > libavfilter/tests/dnn-layer-maximum.c |  71 ---
> > libavfilter/tests/dnn-layer-pad.c | 239 
> > tests/Makefile|   1 -
> > tests/fate/dnn.mak|  45 --
> > 33 files changed, 6 insertions(+), 4440 deletions(-)
> > delete mode 100644 libavfilter/dnn/dnn_backend_native.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native.h
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_conv2d.h
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_dense.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_dense.h
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_depth2space.h
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_mathbinary.c
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_mathbinary.h
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_mathunary.c
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_mathunary.h
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_maximum.c
> > delete mode 100644
> libavfilter/dnn/dnn_backend_native_layer_maximum.h
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.h
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layers.c
> > delete mode 100644 libavfilter/dnn/dnn_backend_native_layers.h
> > delete mode 100644 libavfilter/tests/dnn-layer-avgpool.c
> > delete mode 100644 libavfilter/tests/dnn-layer-conv2d.c
> > delete mode 100644 libavfilter/tests/dnn-layer-dense.c
> > delete mode 100644

Re: [FFmpeg-devel] [PATCH 2/2] lavfi/dnn: Remove DNN native backend

2023-01-02 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Monday, January 2, 2023 07:26 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] lavfi/dnn: Remove DNN native
> backend
> 
> On Fri, Dec 30, 2022 at 04:42:56PM +0800, Ting Fu wrote:
> > According to discussion in
> > https://etherpad.mit.edu/p/FF_dev_meeting_20221202.
> > The DNN native backend should be removed at first step.
> > All the DNN native backend related code is deleted.
> >
> > Signed-off-by: Ting Fu 
> 
> This patch seems breaking
> 
> make testprogs
> make: *** No rule to make target 'libavfilter/tests/dnn-layer-avgpool.c',
> needed by 'libavfilter/tests/dnn-layer-avgpool.o'.  Stop.
> 
> and with a distclean and some configure
> 
> make testprogs
> LD      libavfilter/tests/dnn-layer-avgpool
> gcc: error: libavfilter/tests/dnn-layer-avgpool.o: No such file or directory
> ffbuild/library.mak:118: recipe for target 'libavfilter/tests/dnn-layer-avgpool' failed
> make: *** [libavfilter/tests/dnn-layer-avgpool] Error 1
Hi Michael,

Sorry that 'make testprogs' failed.
This was caused by not deleting the related code for the make target
'testprogs' in libavfilter/Makefile.
I have fixed it and sent V2. Local make passed for all targets.

Thank you
Ting Fu
> 
> [...]
> 
> --
> Michael GnuPG fingerprint:
> 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> The educated differ from the uneducated as much as the living from the
> dead. -- Aristotle

Re: [FFmpeg-devel] [PATCH] lavfi/dnn: Fix OpenVINO missing model file corrupt issue.

2022-08-04 Thread Fu, Ting

Hi Anton,

Thank you for the comment.
After double-checking OpenVINO, it is true that the code would crash if the
binary file does not exist.
We can do nothing in this case, which is why I wrote code to check the file's
existence explicitly.
Yes, you are right, this is not a proper way to do it. But I have no idea how
to solve it more cleanly, since trying to open the file as you mentioned would
lead to an immediate crash. Maybe there is some solution I don't know of; any
further input would be appreciated.

PS: it is not a good commit message, since I have not fixed this issue; it's
just a workaround. I will modify it in the next version.

Thank you,
Ting FU
> -Original Message-
> From: ffmpeg-devel  On Behalf Of Anton
> Khirnov
> Sent: Thursday, August 4, 2022 06:40 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi/dnn: Fix OpenVINO missing model file
> corrupt issue.
> 
> Quoting Ting Fu (2022-08-04 11:31:01)
> > DNN OpenVINO backend would not report missing model file if it does
> > not exist. It would corrupt directly without any error information.
> > This commit
> 
> "corrupt"?
> 
> The patch looks completely wrong. Testing for file existence explicitly is
> known to be a bad pattern that leads to all kinds of races, security issues,
> and other bugs. Just trying to open the file and returning an error if that
> fails is the right thing to do.
> 
> --
> Anton Khirnov
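
The pattern suggested in the review (attempt the open and report the error, rather than testing for existence first) might look like the sketch below. The helper name and return convention are assumptions made for illustration, and it relies on fopen() setting errno, as it does on POSIX systems. It does not address the case described in the reply, where the crash happens inside OpenVINO's own file handling.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open the model file directly and report the
 * failure, instead of checking for existence beforehand (which leaves a
 * window between the check and the open).
 * Returns 0 on success, or a negative value on failure. */
static int open_model_file(const char *path, FILE **out)
{
    FILE *f = fopen(path, "rb");
    if (!f) {
        int err = errno;
        fprintf(stderr, "Failed to open model file '%s': %s\n",
                path, strerror(err));
        /* Fall back to -1 if the platform did not set errno. */
        return err ? -err : -1;
    }
    *out = f;
    return 0;
}
```

The caller owns the returned FILE and closes it with fclose() when done; on failure, *out is left untouched.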

Re: [FFmpeg-devel] [PATCH] lavf/sr: fix the segmentation fault caused by incorrect input frame free.

2022-07-21 Thread Fu, Ting

Kindly ping

> -Original Message-
> From: ffmpeg-devel  On Behalf Of Paul B
> Mahol
> Sent: Monday, June 27, 2022 07:03 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] lavf/sr: fix the segmentation fault caused
> by incorrect input frame free.
> 
> lgtm

Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of DNN backend

2022-05-24 Thread Fu, Ting

> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Jean-Baptiste Kempf
> Sent: Tuesday, May 24, 2022 10:52 PM
> To: ffmpeg-devel 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of
> DNN backend
> 
> Hello,
> 
> On Tue, 24 May 2022, at 16:03, Fu, Ting wrote:
> > I am trying to add this backend since we got some users who have
> > interest in doing PyTorch model(BasicVSR model) inference with FFmpeg.
> 
> I think you are missing my point here.
> We already have 3 backends (TF, Native, OpenVino) in FFmpeg.
> Those are not to support different hardware, but different tastes for users,

Hi Jean-Baptiste,

Yes, you are right, we already have three backends in FFmpeg DNN. But for now,
the native backend is barely workable, due to its weak support for layers and
operations.
And we do support different hardware. For example, the OpenVINO backend
supports inference on Intel GPUs. For now, the TensorFlow and OpenVINO backends
support some models, including super resolution, object detection, and object
classification models. I think it's not only a difference in taste for users,
but an option for them to choose for their own work. AFAIK, there are some
individuals and organizations who are using FFmpeg DNN.

> who prefer one API to another one.
> Where does it end? How many of those backends will we get? 10?
> 
> What's the value to do that development inside ffmpeg?
> 

I think you are asking why we need such a backend. It is because users want to
run inference with BasicVSR and other VSR (video super resolution) models.
Those models are mostly implemented with PyTorch, and converting such a model
to another AI model file format can cause several issues. Besides, video codec
support is an advantage of the FFmpeg framework, which supports various kinds
of hardware acceleration. We would like to use this framework to enhance the
performance of AI inference and improve the user experience.
What I want to emphasize is that the LibTorch backend is not about adding
patches for their own sake, but an actual requirement.

Thank you.
Ting FU

> --
> Jean-Baptiste Kempf -  President
> +33 672 704 734

Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of DNN backend

2022-05-24 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Soft
> Works
> Sent: Tuesday, May 24, 2022 10:24 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of
> DNN backend
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Fu,
> > Ting
> > Sent: Tuesday, May 24, 2022 4:03 PM
> > To: FFmpeg development discussions and patches
> > 
> > Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as
> > one of DNN backend
> >
> > Hi Jean-Baptiste,
> >
> > I am trying to add this backend since we got some users who have
> > interest in doing PyTorch model(BasicVSR model) inference with FFmpeg.
> > And as we all know, the PyTorch is one of the most popular AI
> > inference engines and it has large number of models. So, I think if
> > LibTorch is one of FFmpeg DNN backend, would help the PyTorch users a
> lot.
> >
> > PS, ONNX is not in my plan. I am going to improve the LibTorch backend
> > performance and make it compatible with more models in next steps.
> >
> > Thank you.
> > Ting FU
> 
> Hi Ting,
> 
> I've never looked at the DNN part in ffmpeg, so just out of curiosity:
> 
> Is this working 1-way or 2-way? What I mean is whether this is just about
> feeding images to the AI engines or does the ffmpeg filter get some data in
> return for each frame that is processed?

Hi Softworkz,

Since DNN is part of FFmpeg libavfilter, it can work with other filters.
Other filters can get the output (metadata or just frames) from DNN.

> 
> So for example, in case of object identification/tracking, is it possible to
> get identified rectangles back from the inference result, attach it to an
> AVFrame so that a downstream filter could paint those rectangles on each
> video frame?
> 

Yes, for your object identification example, we preserve the output in the
AVFrameSideData structure of the AVFrame, so the following filters can use
such data.
For now, the AVFrameSideData we save contains the bounding box, the object
position info, and the object category and confidence.

Thank you.
Ting FU
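
To make the answer above concrete, here is a minimal sketch of a downstream consumer reading per-frame detection results. The structs are simplified stand-ins invented for illustration; they are not FFmpeg's side-data types, and the field names are assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified record for one detected object. */
typedef struct DetectedBox {
    int x, y, w, h;        /* position and size in pixels */
    const char *label;     /* object category */
    double confidence;     /* detection confidence in [0, 1] */
} DetectedBox;

/* Hypothetical stand-in for a frame that carries detection results as
 * attached data. */
typedef struct FrameWithBoxes {
    const DetectedBox *boxes;
    size_t nb_boxes;
} FrameWithBoxes;

/* Downstream consumer: prints every box at or above the confidence
 * threshold and returns how many boxes passed. */
static size_t report_boxes(const FrameWithBoxes *frame, double min_confidence)
{
    size_t kept = 0;

    for (size_t i = 0; i < frame->nb_boxes; i++) {
        const DetectedBox *b = &frame->boxes[i];
        if (b->confidence < min_confidence)
            continue;
        printf("%s (%.2f) at x=%d y=%d w=%d h=%d\n",
               b->label, b->confidence, b->x, b->y, b->w, b->h);
        kept++;
    }
    return kept;
}
```

A filter that paints rectangles would follow the same loop, drawing each box instead of printing it.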

> Thanks,
> softworkz
> 
> 
> 

Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of DNN backend

2022-05-24 Thread Fu, Ting

Hi Jean-Baptiste,

I am trying to add this backend since we have some users who are interested in
running PyTorch model (BasicVSR model) inference with FFmpeg.
And as we all know, PyTorch is one of the most popular AI inference engines
and it has a large number of models. So, I think having LibTorch as one of the
FFmpeg DNN backends would help PyTorch users a lot.

PS: ONNX is not in my plan. I am going to improve the LibTorch backend's
performance and make it compatible with more models in the next steps.

Thank you.
Ting FU

> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Jean-Baptiste Kempf
> Sent: Monday, May 23, 2022 05:51 PM
> To: ffmpeg-devel 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of
> DNN backend
> 
> Hello,
> 
> Are we seriously going to add all backends for ML in FFmpeg? Next one is
> ONNX?
> 
> jb
> 
> On Mon, 23 May 2022, at 11:29, Ting Fu wrote:
> > PyTorch is an open source machine learning framework that accelerates
> > the path from research prototyping to production deployment. Official
> > website: https://pytorch.org/. We call the C++ library of PyTorch
> > LibTorch, the same below.
> >
> > To build FFmpeg with LibTorch, please take the following steps as reference:
> > 1. Download the LibTorch C++ library from
> >    https://pytorch.org/get-started/locally/,
> >    selecting C++/Java for language, and other options as you need.
> > 2. Unzip the file to your own dir, with the command
> >    unzip libtorch-shared-with-deps-latest.zip -d your_dir
> > 3. Export libtorch_root/libtorch/include and
> >    libtorch_root/libtorch/include/torch/csrc/api/include to $PATH;
> >    export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH.
> > 4. Configure FFmpeg with
> >    ../configure --enable-libtorch
> >    --extra-cflags=-I/libtorch_root/libtorch/include
> >    --extra-cflags=-I/libtorch_root/libtorch/include/torch/csrc/api/include
> >    --extra-ldflags=-L/libtorch_root/libtorch/lib/
> > 5. make
> >
> > To run FFmpeg DNN inference with the LibTorch backend:
> > ./ffmpeg -i input.jpg -vf
> > dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
> > The LibTorch_model.pt can be generated in Python with the
> > torch.jit.script() API. Please note that torch.jit.trace() is not
> > recommended, since it does not support ambiguous input size.
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  configure |   7 +-
> >  libavfilter/dnn/Makefile  |   1 +
> >  libavfilter/dnn/dnn_backend_torch.cpp | 567
> ++
> >  libavfilter/dnn/dnn_backend_torch.h   |  47 +++
> >  libavfilter/dnn/dnn_interface.c   |  12 +
> >  libavfilter/dnn/dnn_io_proc.c | 117 +-
> >  libavfilter/dnn_filter_common.c   |  31 +-
> >  libavfilter/dnn_interface.h   |   3 +-
> >  libavfilter/vf_dnn_processing.c   |   3 +
> >  9 files changed, 774 insertions(+), 14 deletions(-)
> >  create mode 100644 libavfilter/dnn/dnn_backend_torch.cpp
> >  create mode 100644 libavfilter/dnn/dnn_backend_torch.h
> >
> > diff --git a/configure b/configure
> > index f115b21064..85ce3e67a3 100755
> > --- a/configure
> > +++ b/configure
> > @@ -279,6 +279,7 @@ External library support:
> >--enable-libtheora   enable Theora encoding via libtheora [no]
> >--enable-libtls  enable LibreSSL (via libtls), needed for
> > https support
> > if openssl, gnutls or mbedtls is not used
> > [no]
> > +  --enable-libtorchenable Torch as one DNN backend
> >--enable-libtwolame  enable MP2 encoding via libtwolame [no]
> >--enable-libuavs3d   enable AVS3 decoding via libuavs3d [no]
> >--enable-libv4l2 enable libv4l2/v4l-utils [no]
> > @@ -1850,6 +1851,7 @@ EXTERNAL_LIBRARY_LIST="
> >  libopus
> >  libplacebo
> >  libpulse
> > +libtorch
> >  librabbitmq
> >  librav1e
> >  librist
> > @@ -2719,7 +2721,7 @@ dct_select="rdft"
> >  deflate_wrapper_deps="zlib"
> >  dirac_parse_select="golomb"
> >  dovi_rpu_select="golomb"
> > -dnn_suggest="libtensorflow libopenvino"
> > +dnn_suggest="libtensorflow libopenvino libtorch"
> >  dnn_deps="avformat swscale"
> >  error_resilience_select="me_cmp"
> >  faandct_deps="faan"
> > @@ -6600,6 +6602,7 @@ enabled libopus   && {
> >  }
> >  enabled libplacebo&& require_pkg_config libplacebo "libplacebo
> > >= 4.192.0" libplacebo/vulkan.h pl_vulkan_create
> >  enabled libpulse  && require_pkg_config libpulse libpulse
> > pulse/pulseaudio.h pa_context_new
> > +enabled libtorch  && add_cppflags -D_GLIBCXX_USE_CXX11_ABI=0
> > && check_cxxflags -std=c++14 && require_cpp libtorch torch/torch.h
> > "torch::Tensor" -ltorch -lc10 -ltorch_cpu -lstdc++ -lpthread
> >  enabled librabbitmq   && require_pkg_config librabbitmq
> > "librabbitmq >= 0.7.1" amqp.h amqp_new_connection
> >  enabled librav1e  && require_pkg_config librav1e "rav1e >=
> > 0.4.0" rav1e.h rav1e_context_new
> >  enabled librist   &&

Re: [FFmpeg-devel] [PATCH v2 6/6] doc/filters.texi: Include dnn_processing in docs of sr and derain filter

2021-08-23 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Shubhanshu Saxena
> Sent: Saturday, August 21, 2021 03:59 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Shubhanshu Saxena 
> Subject: [FFmpeg-devel] [PATCH v2 6/6] doc/filters.texi: Include
> dnn_processing in docs of sr and derain filter
> 
> Signed-off-by: Shubhanshu Saxena 
> ---
>  doc/filters.texi | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/filters.texi b/doc/filters.texi index f7b6b61f4c..0b2da7c71f
> 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -9974,7 +9974,7 @@ Note that different backends use different file
> formats. TensorFlow and native  backend can load files for only its format.
>  @end table
> 
> -It can also be finished with @ref{dnn_processing} filter.
> +To get full functionality (such as async execution), please use the
> @ref{dnn_processing} filter.
> 
>  @section deshake
> 
> @@ -19263,7 +19263,7 @@ Default value is @code{2}. Scale factor is
> necessary for SRCNN model, because it  input upscaled using bicubic upscaling
> with proper scale factor.
>  @end table
> 
> -This feature can also be finished with @ref{dnn_processing} filter.
> +To get full functionality (such as async execution), please use the
> @ref{dnn_processing} filter.
> 
>  @section ssim
> 
> --
> 2.25.1

Hi, Shubhanshu
It crashed with the command:
./ffmpeg -i cici.jpg -vf 
dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:backend_configs='async=0',showinfo
 -f null -
It crashed only when async=0.
Please re-check the code with all DNN backends. You can get the necessary model
files above from
https://github.com/guoyejun/ffmpeg_dnn/tree/main/models/openvino/2021.1.

> 

Re: [FFmpeg-devel] [PATCH v3 9/9] [GSoC] lavfi/dnn: DNNAsyncExecModule Execution Failure Handling

2021-08-09 Thread Fu, Ting




> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Shubhanshu Saxena
> Sent: Sunday, August 8, 2021 06:56 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Shubhanshu Saxena 
> Subject: [FFmpeg-devel] [PATCH v3 9/9] [GSoC] lavfi/dnn:
> DNNAsyncExecModule Execution Failure Handling
> 
> This commit adds the case handling if the asynchronous execution of a request
> fails by checking the exit status of the thread when joining before starting
> another execution. On failure, it does the cleanup as well.
> 
> Signed-off-by: Shubhanshu Saxena 
> ---
>  libavfilter/dnn/dnn_backend_common.c | 23 +++
>  libavfilter/dnn/dnn_backend_tf.c | 10 +-
>  2 files changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/libavfilter/dnn/dnn_backend_common.c
> b/libavfilter/dnn/dnn_backend_common.c
> index 470fffa2ae..426683b73d 100644
> --- a/libavfilter/dnn/dnn_backend_common.c
> +++ b/libavfilter/dnn/dnn_backend_common.c
> @@ -23,6 +23,9 @@
> 
>  #include "dnn_backend_common.h"
> 
> +#define DNN_ASYNC_SUCCESS (void *)0
> +#define DNN_ASYNC_FAIL (void *)-1
> +
>  int ff_check_exec_params(void *ctx, DNNBackendType backend,
> DNNFunctionType func_type, DNNExecBaseParams *exec_params)  {
>  if (!exec_params) {
> @@ -79,18 +82,25 @@ static void *async_thread_routine(void *args)
>  DNNAsyncExecModule *async_module = args;
>  void *request = async_module->args;
> 
> -async_module->start_inference(request);
> +if (async_module->start_inference(request) != DNN_SUCCESS) {
> +return DNN_ASYNC_FAIL;
> +}
>  async_module->callback(request);
> -return NULL;
> +return DNN_ASYNC_SUCCESS;
>  }
> 
>  DNNReturnType ff_dnn_async_module_cleanup(DNNAsyncExecModule
> *async_module)  {
> +void *status = 0;
>  if (!async_module) {
>  return DNN_ERROR;
>  }
>  #if HAVE_PTHREAD_CANCEL
> -pthread_join(async_module->thread_id, NULL);
> +pthread_join(async_module->thread_id, &status);
> +if (status == DNN_ASYNC_FAIL) {
> +av_log(NULL, AV_LOG_ERROR, "Last Inference Failed.\n");
> +return DNN_ERROR;
> +}
>  #endif
>  async_module->start_inference = NULL;
>  async_module->callback = NULL;
> @@ -101,6 +111,7 @@ DNNReturnType
> ff_dnn_async_module_cleanup(DNNAsyncExecModule *async_module)
> DNNReturnType ff_dnn_start_inference_async(void *ctx,
> DNNAsyncExecModule *async_module)  {
>  int ret;
> +void *status = 0;
> 
>  if (!async_module) {
>  av_log(ctx, AV_LOG_ERROR, "async_module is null when starting async
> inference.\n"); @@ -108,7 +119,11 @@ DNNReturnType
> ff_dnn_start_inference_async(void *ctx, DNNAsyncExecModule *async_
>  }
> 
>  #if HAVE_PTHREAD_CANCEL
> -pthread_join(async_module->thread_id, NULL);
> +pthread_join(async_module->thread_id, &status);
> +if (status == DNN_ASYNC_FAIL) {
> +av_log(ctx, AV_LOG_ERROR, "Unable to start inference as previous
> inference failed.\n");
> +return DNN_ERROR;
> +}
>  ret = pthread_create(&async_module->thread_id, NULL,
> async_thread_routine, async_module);
>  if (ret != 0) {
>  av_log(ctx, AV_LOG_ERROR, "Unable to start async inference.\n"); 
> diff --git
> a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
> index fb3f6f5ea6..ffec1b1328 100644
> --- a/libavfilter/dnn/dnn_backend_tf.c
> +++ b/libavfilter/dnn/dnn_backend_tf.c
> @@ -91,6 +91,7 @@ AVFILTER_DEFINE_CLASS(dnn_tensorflow);
> 
>  static DNNReturnType execute_model_tf(TFRequestItem *request, Queue
> *inference_queue);  static void infer_completion_callback(void *args);
> +static inline void destroy_request_item(TFRequestItem **arg);
> 
>  static void free_buffer(void *data, size_t length)  { @@ -172,6 +173,10 @@
> static DNNReturnType tf_start_inference(void *args)
>request->status);
>  if (TF_GetCode(request->status) != TF_OK) {
>  av_log(&tf_model->ctx, AV_LOG_ERROR, "%s", TF_Message(request->status));
> +tf_free_request(infer_request);
> +if (ff_safe_queue_push_back(tf_model->request_queue, request) < 0) {
> +destroy_request_item(&request);
> +}
>  return DNN_ERROR;
>  }
>  return DNN_SUCCESS;
> @@ -1095,7 +1100,10 @@ static DNNReturnType
> execute_model_tf(TFRequestItem *request, Queue *inference_q
>  }
> 
>  if (task->async) {
> -return ff_dnn_start_inference_async(ctx, &request->exec_module);
> +if (ff_dnn_start_inference_async(ctx, &request->exec_module) != DNN_SUCCESS) {
> +goto err;
> +}
> +return DNN_SUCCESS;
>  } else {
>  if (tf_start_inference(request) != DNN_SUCCESS) {
>  goto err;
> --
> 2.25.1

LGTM. These patches work well, and the TensorFlow backend performs much better.

> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email

Re: [FFmpeg-devel] [PATCH V2 2/3] dnn/openvino: refine code for better model initialization

2021-01-17 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Monday, January 18, 2021 08:50 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/3] dnn/openvino: refine code for
> better model initialization
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: January 15, 2021 16:43
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V2 2/3] dnn/openvino: refine code for
> > better model initialization
> >
> > Move openvino model/inference request creation and initialization
> > steps from ff_dnn_load_model_ov to new function init_model_ov, for
> > later input resize support.
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/dnn/dnn_backend_openvino.c | 196
> > ++---
> >  1 file changed, 111 insertions(+), 85 deletions(-)
> >
> > -
> > -item->tasks = av_malloc_array(ctx->options.batch_size,
> > sizeof(*item->tasks));
> > -if (!item->tasks) {
> > -av_freep(&item);
> > -goto err;
> > -}
> > -item->task_count = 0;
> 
> this code is missing from the newly added function init_model_ov after the rebase

Hi Yejun,
Thank you for your review.
This has been fixed in PATCH V3.

> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2] libavfilter/dnn: add batch mode for async execution

2021-01-14 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Sunday, January 10, 2021 09:16 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Guo, Yejun 
> Subject: [FFmpeg-devel] [PATCH V2] libavfilter/dnn: add batch mode for async
> execution
> 
> the default number of batch_size is 1
> 
> Signed-off-by: Xie, Lin 
> Signed-off-by: Wu Zhiwen 
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_openvino.c | 187 -
>  libavfilter/dnn/dnn_backend_openvino.h |   1 +
>  libavfilter/dnn/dnn_interface.c|   1 +
>  libavfilter/dnn_interface.h|   2 +
>  libavfilter/vf_dnn_processing.c|  36 -
>  5 files changed, 194 insertions(+), 33 deletions(-)
> 
[...]
>  if (ff_inlink_acknowledge_status(inlink, &status, &pts)) {
>  if (status == AVERROR_EOF) {
> -ff_outlink_set_status(outlink, status, pts);
> +int64_t out_pts = pts;
> +ret = flush_frame(outlink, pts, &out_pts);
> +ff_outlink_set_status(outlink, status, out_pts);
>  return ret;
>  }
>  }
> --
> 2.17.1

Hi Yejun,
This patch works well for me.
Testing was carried out on my machine, which has an i7-8700K 3.7GHz CPU and a UHD630 iGPU.
The patch was tested with the espcn super resolution model (950*540 video as input), with async on and off. The fps increased from 11fps to 13fps (~18% up) on CPU, and from 8fps to 11fps (~37% up) on iGPU.

On CPU with async off:
./ffmpeg -i input_video.mp4 -vf dnn_processing=dnn_backend=openvino:model=espcn1080p.xml:input=x:output=espcn/prediction:async=0:options=device=CPU\&batch_size=1 -y output_video.mp4
On CPU with async on:
./ffmpeg -i input_video.mp4 -vf dnn_processing=dnn_backend=openvino:model=espcn1080p.xml:input=x:output=espcn/prediction:async=1:options=device=CPU\&batch_size=2 -y output_video.mp4

On GPU with async off:
./ffmpeg -i input_video.mp4 -vf dnn_processing=dnn_backend=openvino:model=espcn1080p.xml:input=x:output=espcn/prediction:async=0:options=device=GPU\&batch_size=1 -y output_video.mp4
On GPU with async on:
./ffmpeg -i input_video.mp4 -vf dnn_processing=dnn_backend=openvino:model=espcn1080p.xml:input=x:output=espcn/prediction:async=1:options=device=GPU\&batch_size=2 -y output_video.mp4

> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: unify error return to DNN_ERROR

2020-08-23 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Friday, August 21, 2020 06:57 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: unify error return to
> DNN_ERROR
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: August 21, 2020 11:47
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: unify error return
> > to DNN_ERROR
> >
> > Unify all error return as DNN_ERROR, in order to cease model executing
> > when return error in ff_dnn_execute_model_native layer_func.pf_exec
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/dnn/dnn_backend_native_layer_avgpool.c | 2 +-
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.c  | 4 ++--
> >  libavfilter/dnn/dnn_backend_native_layer_depth2space.c | 4 ++--
> > libavfilter/dnn/dnn_backend_native_layer_mathbinary.c  | 2 +-
> >  libavfilter/dnn/dnn_backend_native_layer_mathunary.c   | 2 +-
> >  libavfilter/dnn/dnn_backend_native_layer_pad.c | 4 ++--
> >  6 files changed, 9 insertions(+), 9 deletions(-)
> >
> 
> we'd better move the following change to patch 1/2 from patch 2/2, so this
> patch is complete.

Sure, thank you for your review.

> 
> -layer_funcs[layer_type].pf_exec(native_model->operands,
> -                                native_model->layers[layer].input_operand_indexes,
> -                                native_model->layers[layer].output_operand_index,
> -                                native_model->layers[layer].params);
> +if (layer_funcs[layer_type].pf_exec(native_model->operands,
> +                                    native_model->layers[layer].input_operand_indexes,
> +                                    native_model->layers[layer].output_operand_index,
> +                                    native_model->layers[layer].params) == DNN_ERROR) {
> +    return DNN_ERROR;
> +}
> 
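The control flow of the quoted change can be sketched in Python as below. This is only an illustration under assumptions: `run_layers`, `ok_layer` and `failing_layer` are hypothetical names, and the integer values of `DNN_SUCCESS` and `DNN_ERROR` are made up here rather than taken from the FFmpeg headers. The point is that execution stops at the first layer that reports an error.

```python
# Illustrative constants; the real values are defined in FFmpeg's DNN headers.
DNN_SUCCESS = 0
DNN_ERROR = 1

def run_layers(layers, operands):
    """Run each layer in order and stop as soon as one reports DNN_ERROR."""
    for layer in layers:
        if layer(operands) == DNN_ERROR:
            return DNN_ERROR
    return DNN_SUCCESS

def ok_layer(operands):
    # Hypothetical layer that succeeds and records that it ran.
    operands.append("ok")
    return DNN_SUCCESS

def failing_layer(operands):
    # Hypothetical layer that always fails.
    return DNN_ERROR

executed = []
status = run_layers([ok_layer, failing_layer, ok_layer], executed)
```

With this sketch the third layer never runs, because the second one has already returned an error.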
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2] dnn: move output name from DNNModel.set_input_output to DNNModule.execute_model

2020-08-21 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Friday, August 21, 2020 01:34 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V2] dnn: move output name from
> DNNModel.set_input_output to DNNModule.execute_model
> 
> currently, output is set both at DNNModel.set_input_output and
> DNNModule.execute_model, it makes sense that the output name is provided at
> model inference time so all the output info is set at a single place.
> 
> and so DNNModel.set_input_output is renamed to DNNModel.set_input
> 
> Signed-off-by: Guo, Yejun 
> ---
> v2: rebase with master
> 
>  libavfilter/dnn/dnn_backend_native.c   | 44 +++--
>  libavfilter/dnn/dnn_backend_native.h   |  4 +-
>  libavfilter/dnn/dnn_backend_openvino.c | 50 +--
> libavfilter/dnn/dnn_backend_openvino.h |  2 +-
>  libavfilter/dnn/dnn_backend_tf.c   | 87 
> ++
>  libavfilter/dnn/dnn_backend_tf.h   |  2 +-
>  libavfilter/dnn_interface.h|  4 +-
>  libavfilter/vf_derain.c|  6 +--
>  libavfilter/vf_dnn_processing.c|  9 ++--
>  libavfilter/vf_sr.c| 11 +++--
>  10 files changed, 82 insertions(+), 137 deletions(-)
> 
[...]
> --
> 2.7.4

LGTM, all three backends (Native/TF/OV) work well.

> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: add native support for avg_pool

2020-08-03 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Monday, August 3, 2020 10:20 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: add native support for
> avg_pool
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: July 30, 2020 18:03
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: add native support
> > for avg_pool
> >
> > Not support pooling strides in channel dimension now.
> > It can be tested with the model generated with below python script:
> >
> > import tensorflow as tf
> > import numpy as np
> > import imageio
> >
> > in_img = imageio.imread('input_odd.jpg')
> > in_img = in_img.astype(np.float32)/255.0
> > in_data = in_img[np.newaxis, :]
> >
> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
> 
> I don't see input/output channel is set in this function parameter, why we
> need them in struct AvgPoolParams?

Hi Yejun,

The in/out_channel fields in struct AvgPoolParams are used to assert() that the
channel count of the input image is equal to the model params.
If an RGBA image is read, the native backend will not process it correctly, so I
think this part is necessary.

> 
[...]
> > +if (avgpool_params->padding_method == SAME) {
> > +height_end = height;
> > +width_end = width;
> > +height_radius = avgpool_params->kernel_size - ((height - 1) %
> > kernel_strides + 1);
> > +width_radius = avgpool_params->kernel_size - ((width - 1) %
> > kernel_strides + 1);
> > +height_radius = height_radius < 0 ? 0 : height_radius >> 1;
> > +width_radius = width_radius < 0 ? 0 : width_radius >> 1;
> > +output_height = ceil(height / (kernel_strides * 1.0));
> > +output_width = ceil(width / (kernel_strides * 1.0));
> > +} else {
> []
> add an assert here, since avg_pool only accepts 'VALID' or 'SAME', while
> padding_method has three enum.
> av_assert0(padding_method == VALID)
> 

Sure, will add in patch V4.

> > +height_end = height - avgpool_params->kernel_size + 1;
[...]
> > diff --git a/libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> > b/libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> > new file mode 100644
> > index 00..0b37a8f64b
> > --- /dev/null
> > +++ b/libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> > @@ -0,0 +1,35 @@
> > +/*
> > + * Copyright (c) 2018 Sergey Lavrushkin
> []
> remove name here

Sorry, will modify in next patch version.

Thank you for your review.
Ting FU
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native support for avg_pool

2020-07-29 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Thursday, July 30, 2020 10:02 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native support for
> avg_pool
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Fu,
> > Ting
> > Sent: July 30, 2020 9:43
> > To: FFmpeg development discussions and patches
> > 
> > Subject: Re: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native
> > support for avg_pool
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Ting Fu
> > > Sent: Wednesday, July 29, 2020 10:11 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Subject: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native
> > > support for avg_pool
> > >
> > > Not support pooling strides in channel dimension now.
> > > It can be tested with the model generated with below python script:
> > >
> > > import tensorflow as tf
> > > import numpy as np
> > > import imageio
> > >
> > > > in_img = imageio.imread('input_odd.jpg')
> > > > in_img = in_img.astype(np.float32)/255.0
> > > > in_data = in_img[np.newaxis, :]
> > > >
> > > > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > > > x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed
> > > > y = tf.identity(x_pool, name='dnn_out')
> > > >
> > > > sess=tf.Session()
> > > > sess.run(tf.global_variables_initializer())
> > > >
> > > > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > > > tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)
> > > >
> > > > print("image_process.pb generated, please use \
> > > > path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")
> > > >
> > > > output = sess.run(y, feed_dict={x: in_data})
> > > > imageio.imsave("out.jpg", np.squeeze(output))
> > >
> > > Signed-off-by: Ting Fu 
> > > ---
> > >  libavfilter/dnn/Makefile  |   1 +
> > >  libavfilter/dnn/dnn_backend_native.h  |   2 +
> > >  .../dnn/dnn_backend_native_layer_avgpool.c| 147
> > ++
> > >  .../dnn/dnn_backend_native_layer_avgpool.h|  35 +
> > >  .../dnn/dnn_backend_native_layer_conv2d.h |   3 +-
> > >  libavfilter/dnn/dnn_backend_native_layers.c   |   2 +
> > >  tools/python/convert_from_tensorflow.py   |  35 -
> > >  7 files changed, 222 insertions(+), 3 deletions(-)  create mode
> > > 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c
> > >  create mode 100644
> > > libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> > >
> > > diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> > > index d90137ec42..e0957073ee 100644
> > > --- a/libavfilter/dnn/Makefile
> > > +++ b/libavfilter/dnn/Makefile
> > > @@ -1,6 +1,7 @@
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_interface.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layers.o
> > > +OBJS-$(CONFIG_DNN)   +=
> > > dnn/dnn_backend_native_layer_avgpool.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layer_pad.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > > dnn/dnn_backend_native_layer_conv2d.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > > dnn/dnn_backend_native_layer_depth2space.o
> > [...]
> > >
> > >
> > > +def dump_avg_pool_to_file(self, node, f):
> > > +assert(node.op == 'AvgPool')
> > > +self.layer_number = self.layer_number + 1
> > > +self.converted_nodes.add(node.name)
> > > +node0 = self.name_node_dict[node.input[0]]
> > > +strides = node.attr['strides']
> > > +assert(strides.list.i[1]==strides.list.i[2])
> > > +assert(strides.list.i[0]==1)
> > > +assert(strides.list.i[3]==1)
> >
> > Since the tensorflow do not support pooling strides in batch
> > dimension, and current do not support pooling in channel dimension, added
> two assert here.
> 
> thanks, and please add the comments within the code.

Thank you Yejun,
if no further comments, this would be the only difference in patch V3.

> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native support for avg_pool

2020-07-29 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting Fu
> Sent: Wednesday, July 29, 2020 10:11 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V2 1/2] dnn/native: add native support for
> avg_pool
> 
> Not support pooling strides in channel dimension now.
> It can be tested with the model generated with below python script:
> 
> import tensorflow as tf
> import numpy as np
> import imageio
> 
> in_img = imageio.imread('input_odd.jpg')
> in_img = in_img.astype(np.float32)/255.0
> in_data = in_img[np.newaxis, :]
> 
> x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed
> y = tf.identity(x_pool, name='dnn_out')
> 
> sess=tf.Session()
> sess.run(tf.global_variables_initializer())
> 
> graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)
> 
> print("image_process.pb generated, please use \
> path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")
> 
> output = sess.run(y, feed_dict={x: in_data})
> imageio.imsave("out.jpg", np.squeeze(output))
> 
> Signed-off-by: Ting Fu 
> ---
>  libavfilter/dnn/Makefile  |   1 +
>  libavfilter/dnn/dnn_backend_native.h  |   2 +
>  .../dnn/dnn_backend_native_layer_avgpool.c| 147 ++
>  .../dnn/dnn_backend_native_layer_avgpool.h|  35 +
>  .../dnn/dnn_backend_native_layer_conv2d.h |   3 +-
>  libavfilter/dnn/dnn_backend_native_layers.c   |   2 +
>  tools/python/convert_from_tensorflow.py   |  35 -
>  7 files changed, 222 insertions(+), 3 deletions(-)  create mode 100644
> libavfilter/dnn/dnn_backend_native_layer_avgpool.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> 
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile index
> d90137ec42..e0957073ee 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -1,6 +1,7 @@
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layers.o
> +OBJS-$(CONFIG_DNN)   +=
> dnn/dnn_backend_native_layer_avgpool.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
>  OBJS-$(CONFIG_DNN)   +=
> dnn/dnn_backend_native_layer_conv2d.o
>  OBJS-$(CONFIG_DNN)   +=
> dnn/dnn_backend_native_layer_depth2space.o
[...]
> 
> 
> +def dump_avg_pool_to_file(self, node, f):
> +assert(node.op == 'AvgPool')
> +self.layer_number = self.layer_number + 1
> +self.converted_nodes.add(node.name)
> +node0 = self.name_node_dict[node.input[0]]
> +strides = node.attr['strides']
> +assert(strides.list.i[1]==strides.list.i[2])
> +assert(strides.list.i[0]==1)
> +assert(strides.list.i[3]==1)

Since TensorFlow does not support pooling strides in the batch dimension, and the
current implementation does not support pooling in the channel dimension,
I added two asserts here.

> +strides = strides.list.i[1]
> +filter_node = node.attr['ksize']
> +input_name = node.input[0]
> +
> +assert(filter_node.list.i[0]==1)
> +assert(filter_node.list.i[3]==1)

Same as above: TensorFlow does not support a pooling ksize in either the batch
dimension or the channel dimension.
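
The checks discussed above can be sketched as a standalone Python helper. This is only an illustration: `check_avg_pool_attrs` is a hypothetical name that does not exist in convert_from_tensorflow.py, and plain lists stand in for the `strides` and `ksize` node attributes (NHWC order assumed).

```python
def check_avg_pool_attrs(strides, ksize):
    """Validate NHWC strides/ksize lists; return (stride, kernel_height, kernel_width)."""
    # No striding or pooling across the batch (index 0) or channel (index 3) dimension.
    assert strides[0] == 1 and strides[3] == 1
    assert ksize[0] == 1 and ksize[3] == 1
    # A single stride value is stored, so height and width strides must match.
    assert strides[1] == strides[2]
    return strides[1], ksize[1], ksize[2]

stride, kernel_h, kernel_w = check_avg_pool_attrs([1, 2, 2, 1], [1, 2, 2, 1])
```

A model that pools across channels, e.g. ksize=[1,2,2,2], would fail the assertion instead of being converted silently.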

> +filter_height = filter_node.list.i[1]
> +filter_width = filter_node.list.i[2]
> +
> +in_channels = node0.attr['shape'].shape.dim[3].size
> +out_channels = in_channels
> +padding = node.attr['padding'].s.decode("utf-8")
> +np.array([self.op2code[node.op], strides, self.pool_paddings[padding], in_channels, out_channels,
> +  filter_height],dtype=np.uint32).tofile(f)
> +
> +input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
> +output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
> +np.array([input_operand_index,
> + output_operand_index],dtype=np.uint32).tofile(f)
> +
> +
>  def dump_layers_to_file(self, f):
>  for node in self.nodes:
>  if node.name in self.converted_nodes:
> @@ -311,6 +342,8 @@ class TFConverter:
> 
>  if node.op == 'Conv2D':
>  self.dump_simple_conv2d_to_file(node, f)
> +if node.op == 'AvgPool':
> +self.dump_avg_pool_to_file(node, f)
>  elif node.op == 'DepthToSpace':
>  self.dump_depth2space_to_file(node, f)
>  elif node.op == 'MirrorPad':
> --
> 2.17.1
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
>

Re: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for avg_pool

2020-07-20 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Monday, July 20, 2020 01:46 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for
> avg_pool
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: July 17, 2020 23:23
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for
> > avg_pool
> >
> > It can be tested with the model generated with below python script:
> >
> > import tensorflow as tf
> > import numpy as np
> > import imageio
> >
> > in_img = imageio.imread('input_odd.jpg')
> > in_img = in_img.astype(np.float32)/255.0
> > in_data = in_img[np.newaxis, :]
> >
> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed
> > y = tf.identity(x_pool, name='dnn_out')
> >
> > sess=tf.Session()
> > sess.run(tf.global_variables_initializer())
> >
> > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)
> >
> > print("image_process.pb generated, please use \
> > path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")
> >
> > output = sess.run(y, feed_dict={x: in_data})
> > imageio.imsave("out.jpg", np.squeeze(output))
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/dnn/Makefile  |   1 +
> >  libavfilter/dnn/dnn_backend_native.h  |   2 +
> >  .../dnn/dnn_backend_native_layer_avgpool.c| 136 ++
> >  .../dnn/dnn_backend_native_layer_avgpool.h|  35 +
> >  .../dnn/dnn_backend_native_layer_conv2d.h |   3 +-
> >  libavfilter/dnn/dnn_backend_native_layers.c   |   2 +
> >  tools/python/convert_from_tensorflow.py   |  31 +++-
> >  7 files changed, 207 insertions(+), 3 deletions(-)  create mode
> > 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c
> >  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> >
[...]
> > +int32_t input_operand_index = input_operand_indexes[0];
> > +int number = operands[input_operand_index].dims[0];
> > +int height = operands[input_operand_index].dims[1];
> > +int width = operands[input_operand_index].dims[2];
> > +int channel = operands[input_operand_index].dims[3];
> 
> the input channel should come from here, not in AvgPoolParams.
> And so as output channel.

Hi Yejun,

I understand that in_channel should come from here. Does 'so as output channel'
mean out_channel = in_channel here (since pooling across channels is not
supported)?

> 
> > +const float *input = operands[input_operand_index].data;
> > +const AvgPoolParams *avgpool_params = (const AvgPoolParams
> > *)parameters;
> > +
> > +float kernel_strides = avgpool_params->strides;
> 
> why float?

It is float so that height/kernel_strides gives a float result for the following
ceil(). Or should I multiply kernel_strides by 1.0 when calling ceil()?
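
For comparison, a small Python sketch of the two ways to get the rounded-up output size. `ceil_div` is a hypothetical helper, not something in the patch; it assumes positive integer inputs and avoids the float conversion entirely.

```python
import math

def ceil_div(size, stride):
    """Integer ceiling of size / stride, for positive integers."""
    return (size + stride - 1) // stride

height, stride = 1080, 11
# Float-based variant, as in ceil(height / (kernel_strides * 1.0)).
float_result = math.ceil(height / (stride * 1.0))
# Integer-only variant.
int_result = ceil_div(height, stride)
```

Both variants give the same output size here, so an int stride with integer arithmetic would be one option.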

> 
> > +int src_linesize = width * avgpool_params->in_channels;
> > +DnnOperand *output_operand = &operands[output_operand_index];
> > +
> > +if (avgpool_params->padding_method == SAME) {
> > +height_end = height;
> > +width_end = width;
> > +height_radius = (avgpool_params->kernel_size - ((height - 1)
> > + % (int)
> > kernel_strides + 1));
> 
> don't need the first '(' and last ')'.

OK

> 
> why we need to consider kernel_strides here?

Because when padding_method=SAME, TensorFlow only pads the zero pixels that the
remainder pixels still need, and puts half of them before the image.
E.g. if the width is 1080 and strides=11, then 1080%11=2.
If ksize=5, it fills (5-2)>>1=1 column before the image and 2 columns after the
image.
If ksize=2, then 2-2=0: the remainder pixels are exactly enough for one more
pooling window, so no zero pixels are filled.
This means the number of zero pixels filled depends on the remainder pixels.
Does the example make sense?
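
The arithmetic of this example can be sketched in Python along one axis. `same_padding` is a hypothetical helper written only for this illustration; it mirrors the radius calculation in the patch and is not TensorFlow's or FFmpeg's actual code.

```python
def same_padding(size, ksize, stride):
    """Return (pad_before, pad_after, output_size) for SAME padding along one axis."""
    # Pixels left over for the last pooling window (1..stride).
    remainder = (size - 1) % stride + 1
    # Zero pixels needed so the last window is complete; never negative.
    total = max(ksize - remainder, 0)
    pad_before = total >> 1
    pad_after = total - pad_before
    # Output size is ceil(size / stride), computed with integers.
    output_size = -(-size // stride)
    return pad_before, pad_after, output_size

wide_kernel = same_padding(1080, 5, 11)   # the ksize=5 case above
small_kernel = same_padding(1080, 2, 11)  # the ksize=2 case above
```

For width 1080 and stride 11 this gives one column before and two after with ksize=5, and no padding with ksize=2, matching the numbers in the example.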

> 
> > +width_radius = (avgpool_params->kernel_size - ((width - 1) %
> > + (int)
> > kernel_strides + 1));
> 
> same as above.
> 
> > +height_radius = height_radius < 0 ? 0 : height_radius >> 1;
> > +width_radius = width_radius < 0 ? 0 : width_radius >> 1;
[...]
> > +for (int y = 0; y < height_end; y += kernel_strides) {
> > +for (int x = 0; x < width_end; x += kernel_strides) {
> > +for (int n_filter = 0; n_filter <
> > + avgpool_params->out_channels;
> > ++n_filter) {
> []
> better to use n_channel, instead of n_filter.

Sure

> 
> > +output[n_filter] = 0.0;
> > +kernel_area = 0;
[...]
> > +def dump_avg_pool_to_file(self, node, f):
> > +assert(node.op == 'AvgPool')
> > +

Re: [FFmpeg-devel] [PATCH] tests/dnn/mathunary: fix the issue of NAN

2020-07-07 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Guo,
> Yejun
> Sent: Wednesday, July 8, 2020 10:39 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] tests/dnn/mathunary: fix the issue of NAN
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: July 2, 2020 21:51
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH] tests/dnn/mathunary: fix the issue of
> > NAN
> >
> > When one of output[i] & expected_output is NAN, the unit test will always pass.
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  tests/dnn/dnn-layer-mathunary-test.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/dnn/dnn-layer-mathunary-test.c
> > b/tests/dnn/dnn-layer-mathunary-test.c
> > index bf77c44bbe..f251447771 100644
> > --- a/tests/dnn/dnn-layer-mathunary-test.c
> > +++ b/tests/dnn/dnn-layer-mathunary-test.c
> > @@ -74,7 +74,8 @@ static int test(DNNMathUnaryOperation op)
> >  output = operands[1].data;
> >  for (int i = 0; i < sizeof(input) / sizeof(float); ++i) {
> >  float expected_output = get_expected(input[i], op);
> > -if(fabs(output[i] - expected_output) > EPS) {
> > +if ((isnan(output[i]) ^ isnan(expected_output)) ||
> > +fabs(output[i] - expected_output) > EPS) {
> 
> it's possible that different platform handles NaN slightly different.
> my suggestion is to describe it simply/clearly to avoid possible issue.
> 
> for example.
> A: isnan(output[i]);
> B: isnan(expected_output);
> C: fabs(output[i] - expected_output) > EPS
> if ( (A && !B) || (!A && B) || (!A && !B && C) )
> 
Hi Yejun,

Thank you for your review. Modified in PATCH V2. The verification logic is as below:
if ( (!A && !B && (ABS_SUB > EPS)) || (A && !B) || (!A && B) )
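
The same logic as a Python sketch, to make the NaN cases explicit. `mismatch` is a hypothetical name and the `EPS` value is only illustrative (the test file defines its own).

```python
import math

EPS = 0.00001  # illustrative tolerance, not necessarily the value used in the test

def mismatch(output, expected, eps=EPS):
    """Return True when the two values should be reported as different."""
    a = math.isnan(output)
    b = math.isnan(expected)
    # Both finite: compare by tolerance. Exactly one NaN: always a mismatch.
    return (not a and not b and abs(output - expected) > eps) or (a and not b) or (not a and b)

nan = float("nan")
# The old check, fabs(output - expected) > EPS, is False whenever a NaN is involved.
old_check_flags_nan = abs(nan - 1.0) > EPS
```

Two NaN values are treated as equal here, which follows from the expression above.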

Thank you
Ting FU
> 
> >  printf("at index %d, output: %f, expected_output: %f\n",
> > i, output[i], expected_output);
> >  av_freep(&output);
> >  return 1;
> > --
> > 2.17.1
> >
> > ___
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Fix Segmentation Fault when load unaligned data

2020-02-25 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Carl
> Eugen Hoyos
> Sent: Tuesday, February 25, 2020 05:43 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Fix Segmentation
> Fault when load unaligned data
> 
> 
> 
> > Am 25.02.2020 um 07:29 schrieb Ting Fu :
> >
> > Signed-off-by: Ting Fu 
> > ---
> > libswscale/x86/yuv_2_rgb.asm | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/libswscale/x86/yuv_2_rgb.asm
> > b/libswscale/x86/yuv_2_rgb.asm index e05bbb89f5..575a84d921 100644
> > --- a/libswscale/x86/yuv_2_rgb.asm
> > +++ b/libswscale/x86/yuv_2_rgb.asm
> > @@ -139,7 +139,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num,
> reg_num, parameters
> > VBROADCASTSD vr_coff,  [pointer_c_ditherq + 4  * 8] %endif %endif
> > -mova m_y, [py_2indexq + 2 * indexq]
> > +movu m_y, [py_2indexq + 2 * indexq]
> > movh m_u, [pu_indexq  + indexq]
> > movh m_v, [pv_indexq  + indexq]
> > .loop0:
> > @@ -347,7 +347,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num,
> reg_num,
> > parameters %endif ; PACK RGB15/16 %endif ; PACK RGB15/16/32
> >
> > -mova m_y, [py_2indexq + 2 * indexq + 8 * time_num]
> > +movu m_y, [py_2indexq + 2 * indexq + 8 * time_num]
> 
> If there is a related ticket in trac, please mention it in the commit message.
> 
> Carl Eugen

Sorry for the missing ticket info. Added in patch V2.

Thank you,
Ting Fu
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org
> with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2020-02-10 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Fu,
> Ting
> Sent: Monday, February 10, 2020 05:54 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add
> ssse3 version
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> Paul
> > B Mahol
> > Sent: Monday, February 10, 2020 04:56 PM
> > To: FFmpeg development discussions and patches  > de...@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add
> > ssse3 version
> >
> > Does this pass fate?
> > If yes i will apply.
> 
> Hi Paul,
> 
> It has passed the fate.
> 
> Thank you,
> Ting Fu

Hi Paul,

After a further check, I found that there may be no FATE test covering
yuv2rgb. I only ran 'make fate' yesterday and saw no errors, sorry for that.
I have also compared the MD5 of the ssse3 and mmx output (100 frames, with
and without my patch applied). They are identical. Examples with 5 frames
are below.
I think this also shows that the ssse3 output matches the mmx output.

Command:  ./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv -vcodec 
rawvideo -s 1920*1080 -pix_fmt FMT -vframes 5 -f framemd5 FMT.md5 (FMT = 
{RGB24, BGR24, RGB32, BGR32, RGB555, RGB565})
YUV420P to RGB24
mmx:
2c891b51088497776cd85ab8bdf7d9c0
fa8df95a630096a96ee06048b446f7cd
21048f651cfa5e5309b7838d119070bd
3149361521068d00862ec9f88224c92d
7790cd3d7a966546a6c245738dc7430b
ssse3:
2c891b51088497776cd85ab8bdf7d9c0
fa8df95a630096a96ee06048b446f7cd
21048f651cfa5e5309b7838d119070bd
3149361521068d00862ec9f88224c92d
7790cd3d7a966546a6c245738dc7430b
YUV420P to BGR24
mmx:
46f19629d8f34fe4dba3df99c49668fa
000e251995a0269dd0dce77c31711e98
edccc926fbee775b00e2a5da4a7365e1
d46e286699c0361fa21832c33c662120
a79321d438111dae575158aefca4c87f
ssse3:
46f19629d8f34fe4dba3df99c49668fa
000e251995a0269dd0dce77c31711e98
edccc926fbee775b00e2a5da4a7365e1
d46e286699c0361fa21832c33c662120
a79321d438111dae575158aefca4c87f
YUV420P to RGB32
mmx:
51e99fab05a84fb08c62ddde3898648e
4c078b41498d1fde73a8b636ba3c
c60448161ed3662c79445e24d5e6b834
4a4ae471a2e5448833fbbf41a7de5c42
0200b2a18f1212c5adb5d6f59b513503
ssse3:
51e99fab05a84fb08c62ddde3898648e
4c078b41498d1fde73a8b636ba3c
c60448161ed3662c79445e24d5e6b834
4a4ae471a2e5448833fbbf41a7de5c42
0200b2a18f1212c5adb5d6f59b513503
YUV420P to BGR32
mmx:
a494abb5a4fa9f51a228d13d2452cc87
0fb30a71f70eb977580968a96cefca43
77feb78a9355601d2d36dfe15a6a3993
b64b7717e3baaa012d98a019f64b55d8
1abca5233e66e0dca63c8476c08e830f
ssse3:
a494abb5a4fa9f51a228d13d2452cc87
0fb30a71f70eb977580968a96cefca43
77feb78a9355601d2d36dfe15a6a3993
b64b7717e3baaa012d98a019f64b55d8
1abca5233e66e0dca63c8476c08e830f
YUV420P to BGR555
mmx:
1874dea6f6f7afb872d748530c3e7116
af1f98567c1d7c3fa9705d08b7873231
2f4e21ab5dd6499c57db0932bbdc2498
4eccd6dcf69dba8e1e5365cad2cdcf84
0c20aaf1602522a012b1543f0cef7169
ssse3:
1874dea6f6f7afb872d748530c3e7116
af1f98567c1d7c3fa9705d08b7873231
2f4e21ab5dd6499c57db0932bbdc2498
4eccd6dcf69dba8e1e5365cad2cdcf84
0c20aaf1602522a012b1543f0cef7169
YUV420P to BGR565
mmx:
6cac48711756152b3ca815b6985b2882
93264a7683ff0761a8ef300590674e40
fc0242e2f7dacbcd9a5f9ac7229a8d9f
dbc5acb000fa5a9d468ab8db03431c39
af7a512c00c1b177d4f65e6d056ea97c
ssse3:
6cac48711756152b3ca815b6985b2882
93264a7683ff0761a8ef300590674e40
fc0242e2f7dacbcd9a5f9ac7229a8d9f
dbc5acb000fa5a9d468ab8db03431c39
af7a512c00c1b177d4f65e6d056ea97c
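
The per-frame comparison above can also be done mechanically. This is a
minimal sketch, assuming the framemd5 hashes have already been extracted
into two string arrays; first_mismatch is a hypothetical helper, not part
of FFmpeg.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first frame whose hashes differ, or -1 if all
 * n per-frame hashes match. Both arrays hold NUL-terminated hex strings. */
static int first_mismatch(const char *const a[], const char *const b[],
                          size_t n)
{
    assert(a && b);
    for (size_t i = 0; i < n; i++)
        if (strcmp(a[i], b[i]) != 0)
            return (int)i;
    return -1;
}
```

A return value of -1 for every format corresponds to the "they are the same"
result reported above.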

Command: ./ffmpeg -pix_fmt yuva420p -s 1920*1080 -i 
Allegro_HEVC_Main444_MT41_Funny_00_1920x1080_r2260.yuva420.yuv -vcodec rawvideo 
-s 1920*1080 -pix_fmt FMT -vframes 5 -f framemd5 FMT.md5 (FMT = {RGB32, BGR32}) 
YUVA420P to RGB32
mmx:
602cfb10a7eadebd7f85c8937ab43509
210ed4c10d021c754b2ab9a434b0f0c2
88eef228f50264c6340ae5490412a4aa
8fcda91158c4c26309da6f76cf79c4a7
6cd6d747ce91726b33c649df8b04f15c
ssse3:
602cfb10a7eadebd7f85c8937ab43509

Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2020-02-10 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Paul
> B Mahol
> Sent: Monday, February 10, 2020 04:56 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add
> ssse3 version
> 
> Does this pass fate?
> If yes i will apply.

Hi Paul,

It has passed FATE.

Thank you,
Ting Fu

> 
> On 2/10/20, Fu, Ting  wrote:
> >
> >
> >> -Original Message-
> >> From: ffmpeg-devel  On Behalf Of
> >> Ting Fu
> >> Sent: Sunday, January 19, 2020 11:51 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add
> >> ssse3 version
> >>
> >> Tested using this command:
> >> /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> >> rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >>
> >> The fps increase from 389 to 640 on Intel(R) Core(TM) i7-8700K CPU @
> >> 3.70GHz
> >>
> >> Signed-off-by: Ting Fu 
> >> ---
> >>  libswscale/x86/yuv2rgb.c |  38 +
> >>  libswscale/x86/yuv_2_rgb.asm | 145
> +++--
> >> --
> >>
> > [...]
> >> +
> >> +INIT_XMM ssse3
> >> +yuv2rgb_fn yuv,  rgb, 24
> >> +yuv2rgb_fn yuv,  bgr, 24
> >> +yuv2rgb_fn yuv,  rgb, 32
> >> +yuv2rgb_fn yuv,  bgr, 32
> >> +yuv2rgb_fn yuva, rgb, 32
> >> +yuv2rgb_fn yuva, bgr, 32
> >> +yuv2rgb_fn yuv,  rgb, 15
> >> +yuv2rgb_fn yuv,  rgb, 16
> >> --
> >> 2.17.1
> >
> > Hi all,
> >
> > Any further comment for this single patch?
> > This version has only changed a little compared with V4 (reviewed by
> > Paul), which is that the check of SIMD has been moved from the wrapper
> > function (in
> > libswscale/x86/yuv2rgb_template.c) to format check (in
> > libswscale/x86/yuv2rgb.c).
> > While assembly code is exactly same with the V4.
> >
> > Thank you,
> > Ting Fu.
> >
> >>

Re: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2020-02-10 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting
> Fu
> Sent: Sunday, January 19, 2020 11:51 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V8 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> Tested using this command:
> /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> 
> The fps increase from 389 to 640 on Intel(R) Core(TM) i7-8700K CPU @
> 3.70GHz
> 
> Signed-off-by: Ting Fu 
> ---
>  libswscale/x86/yuv2rgb.c |  38 +
>  libswscale/x86/yuv_2_rgb.asm | 145 +++--
> --
>
[...]
> +
> +INIT_XMM ssse3
> +yuv2rgb_fn yuv,  rgb, 24
> +yuv2rgb_fn yuv,  bgr, 24
> +yuv2rgb_fn yuv,  rgb, 32
> +yuv2rgb_fn yuv,  bgr, 32
> +yuv2rgb_fn yuva, rgb, 32
> +yuv2rgb_fn yuva, bgr, 32
> +yuv2rgb_fn yuv,  rgb, 15
> +yuv2rgb_fn yuv,  rgb, 16
> --
> 2.17.1

Hi all,

Any further comments on this patch?
Compared with V4 (reviewed by Paul), this version changes only one thing:
the SIMD check has been moved from the wrapper function (in
libswscale/x86/yuv2rgb_template.c) to the format check (in
libswscale/x86/yuv2rgb.c).
The assembly code is exactly the same as in V4.

Thank you,
Ting Fu.

> 

Re: [FFmpeg-devel] [PATCH V8 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-02-04 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Tuesday, January 21, 2020 05:04 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V8 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Sun, Jan 19, 2020 at 11:51:03AM +0800, Ting Fu wrote:
> > The original inline assembly and nasm code have the same fps when called by
> command.
> > NASM code almost has no impact on the perfromance.
> >
> > Signed-off-by: Ting Fu 
> > ---
> > V8:
> > Remove all reindention to make review easier.
> > Fix some improper indention.
> > Reserve the "inline" for next patch.
> >
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  26 +-
> >  libswscale/x86/yuv2rgb_template.c | 392 +-
> >  libswscale/x86/yuv_2_rgb.asm  | 270 
> >  5 files changed, 351 insertions(+), 354 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> 
> Seems to work, i intend to apply this in a few days
> 
> thx

Ping.

Thank you.

> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> I do not agree with what you have to say, but I'll defend to the death your 
> right
> to say it. -- Voltaire

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-19 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ronald
> S. Bultje
> Sent: Sunday, January 19, 2020 10:55 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> Hi,
> 
> On Sat, Jan 18, 2020 at 9:49 PM Fu, Ting  wrote:
> 
> > Since NASM function will get only the address of SwsConext c ( in
> > order to be compatible with yuv2rgb_c function in parameters), not the
> > address of
> > c->redDither nor the c->dstW. I have no way to get the value of
> > c->c->dstW by
> > using address offset.
> 
> 
> Nasm and related variants have "struc" (like "struct" in C) for this.
Hi Ronald,

Thank you so much for the information. I believe it will be helpful in future 
patches.

Thank you,
Ting Fu
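
As a note on the "struc" suggestion: it amounts to giving names to the byte
offsets of fields, so that assembly holding only the base address of a
context can still reach a member. The C-side view of the same idea is
offsetof. The sketch below uses a hypothetical DemoContext whose layout is
NOT that of the real SwsContext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context for illustration only; not the SwsContext layout. */
typedef struct DemoContext {
    int      dstW;
    uint64_t redDither;
    uint64_t greenDither;
    uint64_t blueDither;
} DemoContext;

/* Reads an int field given only the base address and a byte offset, which
 * is what assembly does when it addresses a named struc member. */
static int demo_read_int(const void *base, size_t offset)
{
    int v;
    assert(base);
    memcpy(&v, (const unsigned char *)base + offset, sizeof v);
    return v;
}
```

Calling demo_read_int(&ctx, offsetof(DemoContext, dstW)) returns ctx.dstW
without the callee knowing the struct type.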
> 
> See for example this code:
> https://code.videolan.org/videolan/dav1d/blob/master/src/x86/film_grain.asm
> #L64
> 
> Hope this helps,
> Ronald

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-19 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Sunday, January 19, 2020 09:11 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Sun, Jan 19, 2020 at 02:49:21AM +, Fu, Ting wrote:
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Michael Niedermayer
> > > Sent: Friday, January 17, 2020 05:36 AM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > On Thu, Jan 16, 2020 at 07:27:05AM +, Fu, Ting wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: ffmpeg-devel  On Behalf
> > > > > Of Michael Niedermayer
> > > > > Sent: Wednesday, January 15, 2020 05:55 AM
> > > > > To: FFmpeg development discussions and patches
> > > > > 
> > > > > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > > > > Change inline assembly into nasm code
> > > > >
> > > > > On Fri, Jan 10, 2020 at 01:38:15AM +0800, Ting Fu wrote:
> > > > > > Signed-off-by: Ting Fu 
> > > > > > ---
> > > > > > V7:
> > > > > > Fix compile issue when user configure with --disable-mmx.
> > > > > > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > > > > > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> > > > > >
> > > > > >  libswscale/x86/Makefile   |   1 +
> > > > > >  libswscale/x86/swscale.c  |  16 +-
> > > > > >  libswscale/x86/yuv2rgb.c  |  66 ++---
> > > > > >  libswscale/x86/yuv2rgb_template.c | 467 
> > > > > > ++
> > > > > >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> > > > > >  5 files changed, 405 insertions(+), 415 deletions(-)  create
> > > > > > mode
> > > > > > 100644 libswscale/x86/yuv_2_rgb.asm
> > > > >
> > > > > The commit message seems a bit terse I think it should say if
> > > > > the sequence of instructions is unchanged and if it was
> > > > > benchmaked. If its the same speed, when the code is run the
> > > > > commit message should say that too
> > > > >
> > > > > the principle of this (inline -> nasm) is fine of course.
> > > > >
> > > > >
> > > > [...]
> > > > > > -static inline int RENAME(yuv420_rgb16)(SwsContext *c, const
> > > > > > uint8_t
> > > *src[],
> > > > > > -   int srcStride[],
> > > > > > -   int srcSliceY, int 
> > > > > > srcSliceH,
> > > > > > -   uint8_t *dst[], int 
> > > > > > dstStride[])
> > > > > > +static int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t 
> > > > > > *src[],
> > > > > > +   int srcStride[],
> > > > > > +   int srcSliceY, int 
> > > > > > srcSliceH,
> > > > > > +   uint8_t
> > > > > > +*dst[], int
> > > > > > +dstStride[])
> > > > >
> > > > > maybe the removial of inline should be a seperate patch also
> > > > > there is the question why these wraper functions exist These do
> > > > > change from a a "free thing in inline asm" to a call overhead
> > > > > with C->NASM
> > > > >
> > > > Hi Michael,
> > > >
> > > > The wrapper functions initiate some variables and contain one 'for
> > > > cycle'. The
> > > variable initiation needs to access to the 'c->dstW', furthermore
> > > macro SWS_MAX_ FILTER_SIZE is needed. Which means extra work and
> > > much more NASM code.
> > > > If you still prefer to do all the things in assembly, I can change from 
> > > > 'C-
> &

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-18 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, January 17, 2020 05:36 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Thu, Jan 16, 2020 at 07:27:05AM +, Fu, Ting wrote:
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Michael Niedermayer
> > > Sent: Wednesday, January 15, 2020 05:55 AM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > On Fri, Jan 10, 2020 at 01:38:15AM +0800, Ting Fu wrote:
> > > > Signed-off-by: Ting Fu 
> > > > ---
> > > > V7:
> > > > Fix compile issue when user configure with --disable-mmx.
> > > > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > > > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> > > >
> > > >  libswscale/x86/Makefile   |   1 +
> > > >  libswscale/x86/swscale.c  |  16 +-
> > > >  libswscale/x86/yuv2rgb.c  |  66 ++---
> > > >  libswscale/x86/yuv2rgb_template.c | 467 ++
> > > >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> > > >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > > > 100644 libswscale/x86/yuv_2_rgb.asm
> > >
> > > The commit message seems a bit terse I think it should say if the
> > > sequence of instructions is unchanged and if it was benchmaked. If
> > > its the same speed, when the code is run the commit message should
> > > say that too
> > >
> > > the principle of this (inline -> nasm) is fine of course.
> > >
> > >
> > [...]
> > > > -static inline int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t
> *src[],
> > > > -   int srcStride[],
> > > > -   int srcSliceY, int srcSliceH,
> > > > -   uint8_t *dst[], int dstStride[])
> > > > +static int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > > > +   int srcStride[],
> > > > +   int srcSliceY, int 
> > > > srcSliceH,
> > > > +   uint8_t *dst[],
> > > > +int
> > > > +dstStride[])
> > >
> > > maybe the removial of inline should be a seperate patch also there
> > > is the question why these wraper functions exist These do change
> > > from a a "free thing in inline asm" to a call overhead with C->NASM
> > >
> > Hi Michael,
> >
> > The wrapper functions initiate some variables and contain one 'for cycle'. 
> > The
> variable initiation needs to access to the 'c->dstW', furthermore macro
> SWS_MAX_ FILTER_SIZE is needed. Which means extra work and much more
> NASM code.
> > If you still prefer to do all the things in assembly, I can change from 
> > 'C->NASM'
> to 'call NASM function directly' in another further patch( for current patch 
> easier
> to review).
> > Or in my opinion, the cost in C->NASM can be ignored, and the initiation 
> > work
> looks clearer in C, just let it be what it is now.
> > What do you think?
> 
> it probably makes no sense if its hard to convert that code

Hi Michael,

Do you mean I still need to convert that code? Did I understand you
correctly?
The NASM function receives only the address of SwsContext c (to stay
compatible with the parameters of the yuv2rgb_c function), not the address
of c->redDither or c->dstW. I see no way to get the value of c->dstW by
using an address offset.
Do you have any suggestions for solving that problem?

Thank you,
Ting Fu
> 
> thx
> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> If you fake or manipulate statistics in a paper in physics you will never get 
> a job
> again.
> If you fake or manipulate statistics in a paper in medicin you will get a job 
> for life
> at the pharma industry.

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-15 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Wednesday, January 15, 2020 05:55 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Fri, Jan 10, 2020 at 01:38:15AM +0800, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> > V7:
> > Fix compile issue when user configure with --disable-mmx.
> > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> >
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  66 ++---
> >  libswscale/x86/yuv2rgb_template.c | 467 ++
> >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> 
> The commit message seems a bit terse
> I think it should say if the sequence of instructions is unchanged and if it 
> was
> benchmaked. If its the same speed, when the code is run the commit message
> should say that too
> 
> the principle of this (inline -> nasm) is fine of course.
> 
> 
[...]
> > -static inline int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > -   int srcStride[],
> > -   int srcSliceY, int srcSliceH,
> > -   uint8_t *dst[], int dstStride[])
> > +static int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > +   int srcStride[],
> > +   int srcSliceY, int 
> > srcSliceH,
> > +   uint8_t *dst[], int
> > +dstStride[])
> 
> maybe the removial of inline should be a seperate patch also there is the
> question why these wraper functions exist These do change from a a "free thing
> in inline asm" to a call overhead with C->NASM
> 
Hi Michael,

The wrapper functions initialize some variables and contain one 'for' loop.
The variable initialization needs access to 'c->dstW', and the macro
SWS_MAX_FILTER_SIZE is also needed, which means extra work and much more
NASM code.
If you still prefer to do everything in assembly, I can change from
'C->NASM' to 'call the NASM function directly' in a later patch (to keep the
current patch easier to review).
Otherwise, in my opinion, the cost of C->NASM is negligible and the
initialization looks clearer in C, so it could stay as it is now.
What do you think?

Thank you,
Ting Fu
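
To make the wrapper arrangement concrete, here is a rough sketch of the
shape being discussed: a C function that walks a 4:2:0 slice two rows at a
time and hands each row pair to a kernel. All names are hypothetical, the
strides are assumed to be in bytes, and the counting kernel only stands in
for the assembly routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kernel type: stands in for the assembly routine that
 * converts two luma rows (one chroma row) of 4:2:0 input. */
typedef void (*demo_row_pair_fn)(const uint8_t *y, const uint8_t *u,
                                 const uint8_t *v, uint8_t *dst, int width);

/* Trivial kernel that only counts how often it is called. */
static int demo_calls;
static void demo_count_kernel(const uint8_t *y, const uint8_t *u,
                              const uint8_t *v, uint8_t *dst, int width)
{
    (void)y; (void)u; (void)v; (void)dst; (void)width;
    demo_calls++;
}

/* C wrapper: the per-slice setup and the row loop stay in C, and only the
 * inner conversion is delegated. Returns the number of rows processed. */
static int demo_convert_slice(demo_row_pair_fn kernel,
                              const uint8_t *y, int y_stride,
                              const uint8_t *u, const uint8_t *v,
                              int c_stride,
                              uint8_t *dst, int dst_stride,
                              int width, int slice_h)
{
    assert(kernel && slice_h % 2 == 0);
    for (int row = 0; row < slice_h; row += 2) {
        kernel(y   + row * y_stride,
               u   + (row / 2) * c_stride,
               v   + (row / 2) * c_stride,
               dst + row * dst_stride, width);
    }
    return slice_h;
}
```

The call overhead in this shape is one indirect call per row pair, which is
the cost being weighed against moving the loop itself into assembly.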

[...]

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-14 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Wednesday, January 15, 2020 05:55 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Fri, Jan 10, 2020 at 01:38:15AM +0800, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> > V7:
> > Fix compile issue when user configure with --disable-mmx.
> > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> >
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  66 ++---
> >  libswscale/x86/yuv2rgb_template.c | 467 ++
> >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> 
> The commit message seems a bit terse
> I think it should say if the sequence of instructions is unchanged and if it 
> was
> benchmaked. If its the same speed, when the code is run the commit message
> should say that too
> 
> the principle of this (inline -> nasm) is fine of course.
> 
Hi Michael,

Got it, I will add more information in the next patch version.

> 
> >
[...]
> > -break;
> > -} else
> > -return yuv420_bgr32_mmx;
> > -case AV_PIX_FMT_RGB24:
> > -return yuv420_rgb24_mmx;
> > -case AV_PIX_FMT_BGR24:
> > -return yuv420_bgr24_mmx;
> > -case AV_PIX_FMT_RGB565:
> > -return yuv420_rgb16_mmx;
> > -case AV_PIX_FMT_RGB555:
> > -return yuv420_rgb15_mmx;
> > +break;
> > +} else
> > +return yuv420_bgr32_mmx;
> > +case AV_PIX_FMT_RGB24:
> > +return yuv420_rgb24_mmx;
> > +case AV_PIX_FMT_BGR24:
> > +return yuv420_bgr24_mmx;
> > +case AV_PIX_FMT_RGB565:
> > +return yuv420_rgb16_mmx;
> > +case AV_PIX_FMT_RGB555:
> > +return yuv420_rgb15_mmx;
> >  }
> >  }
> 
> this is a little messy to review
> it is mostly reindention
> yuv2rgb.c  |   66 +++
> and with -w
> yuv2rgb.c  |   26 +--
> 

All reindentation will be removed.

> 
> 
> [...]
> > -static inline int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > -   int srcStride[],
> > -   int srcSliceY, int srcSliceH,
> > -   uint8_t *dst[], int dstStride[])
> > +static int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > +   int srcStride[],
> > +   int srcSliceY, int 
> > srcSliceH,
> > +   uint8_t *dst[], int
> > +dstStride[])
> 
> maybe the removial of inline should be a seperate patch also there is the
> question why these wraper functions exist These do change from a a "free thing
> in inline asm" to a call overhead with C->NASM

I will try to call nasm directly by removing the wrapper function.

> 
> 
> >  {
> >  int y, h_size, vshift;
> > -
> >  YUV2RGB_LOOP(2)
> >
> >  #ifdef DITHER1XBPP
> > -c->blueDither  = ff_dither8[y   & 1];
> > -c->greenDither = ff_dither4[y   & 1];
> > -c->redDither   = ff_dither8[(y + 1) & 1];
> > +c->blueDither  = ff_dither8[y   & 1];
> > +c->greenDither = ff_dither4[y   & 1];
> > +c->redDither   = ff_dither8[(y + 1) & 1];
> 
> these changes make the patch harder to review and the resulting commit harder
> to read too (and i manually matched these up above it lookes worse in the
> actual diff

The reindentation will be removed.

> 
> 
> > -#endif
> > -
> > -YUV2RGB_INITIAL_LOAD
> > -YUV2RGB
> > -RGB_PACK_INTERLEAVE
> > -#ifdef DITHER1XBPP
> > -DITHER_RGB
> >  #endif
> > -RGB_PACK16(pb_07, 0)
> >
> > -YUV2RGB_ENDLOOP(2)
> > -YUV2RGB_OPERANDS
> > -YUV2RGB_ENDFUNC
> 
> > +RENAME(ff_yuv_420_rgb16)(index, image, pu - index, pv - index, &(c-
> >redDither), py - 2 * index);
> > +}
> > +return srcSliceH;
> 
> This doesnt look correctly indented

Will be changed.

Thank you,
Ting Fu
> 
> 
> thanks
> 
> 
> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> The smallest minority on earth is the individual. Those who deny individual 
> rights
> cannot claim to be defenders of minorities. - Ayn Rand

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-13 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Fu,
> Ting
> Sent: Tuesday, January 14, 2020 02:15 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Fu,
> > Ting
> > Sent: Friday, January 10, 2020 01:58 AM
> > To: FFmpeg development discussions and patches
> > 
> > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > Change inline assembly into nasm code
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Ting Fu
> > > Sent: Friday, January 10, 2020 01:38 AM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Subject: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > Signed-off-by: Ting Fu 
> > > ---
> > > V7:
> > > Fix compile issue when user configure with --disable-mmx.
> > > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> >
> > To be more detail. I was use 'if clause' to judge the color format in
> > libswscale/x86/yuv2rgb.c and then the '#if macro' to judge SIMD in
> > libswscale/x86/yuv2rgb_template.c. Which cannot correctly respond to
> > the command when use ./ffmpeg with --cpuflags, cause it does not get
> > value of
> > av_get_cpu_flags() any more. So, I abandoned the macro and judge both
> > color format and SIMD in libswscale/x86/yuv2rgb.c.
> >
> > Thank you,
> > Ting Fu
> > >
> > >  libswscale/x86/Makefile   |   1 +
> > >  libswscale/x86/swscale.c  |  16 +-
> > >  libswscale/x86/yuv2rgb.c  |  66 ++---
> > >  libswscale/x86/yuv2rgb_template.c | 467 ++
> > >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> > >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > > 100644 libswscale/x86/yuv_2_rgb.asm
> > >
> A kindle ping.

Sorry, I meant 'a kindly ping'.

Ting Fu
> [...]
> > > --
> > > 2.17.1
> > >

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-13 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Fu,
> Ting
> Sent: Friday, January 10, 2020 01:58 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Ting
> > Fu
> > Sent: Friday, January 10, 2020 01:38 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> > inline assembly into nasm code
> >
> > Signed-off-by: Ting Fu 
> > ---
> > V7:
> > Fix compile issue when user configure with --disable-mmx.
> > Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> 
> To be more detail. I was use 'if clause' to judge the color format in
> libswscale/x86/yuv2rgb.c and then the '#if macro' to judge SIMD in
> libswscale/x86/yuv2rgb_template.c. Which cannot correctly respond to the
> command when use ./ffmpeg with --cpuflags, cause it does not get value of
> av_get_cpu_flags() any more. So, I abandoned the macro and judge both color
> format and SIMD in libswscale/x86/yuv2rgb.c.
> 
> Thank you,
> Ting Fu
> >
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  66 ++---
> >  libswscale/x86/yuv2rgb_template.c | 467 ++
> >  libswscale/x86/yuv_2_rgb.asm  | 270 +
> >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> >
A kindle ping.
[...]
> > --
> > 2.17.1
> >

Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-09 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting
> Fu
> Sent: Friday, January 10, 2020 01:38 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> Signed-off-by: Ting Fu 
> ---
> V7:
> Fix compile issue when user configure with --disable-mmx.
> Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c

In more detail: I used an 'if' clause to check the color format in 
libswscale/x86/yuv2rgb.c and then an '#if' macro to check the SIMD level in 
libswscale/x86/yuv2rgb_template.c. That could not respond correctly when 
./ffmpeg is run with -cpuflags, because the macro path no longer used the 
value of av_get_cpu_flags(). So I dropped the macro and now check both the 
color format and the SIMD level in libswscale/x86/yuv2rgb.c.
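
For illustration, the idea of deciding both the SIMD level and the color
format at run time can be sketched as below. This is only a minimal sketch,
not the code from the patch: the FLAG_* bits, the dst_format enum, the
converter functions and pick_converter() are all hypothetical stand-ins for
the AV_CPU_FLAG_* values, AV_PIX_FMT_* values and the real yuv420_* functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, stand-ins for the real AV_CPU_FLAG_* values. */
enum { FLAG_MMX = 1 << 0, FLAG_MMXEXT = 1 << 1, FLAG_SSSE3 = 1 << 2 };
/* Hypothetical destination formats, stand-ins for AV_PIX_FMT_*. */
enum dst_format { FMT_RGB24, FMT_RGB32 };

typedef const char *(*convert_fn)(void);

/* Dummy converters that only report which variant was chosen. */
static const char *rgb24_ssse3(void)  { return "rgb24_ssse3"; }
static const char *rgb24_mmxext(void) { return "rgb24_mmxext"; }
static const char *rgb32_ssse3(void)  { return "rgb32_ssse3"; }
static const char *rgb32_mmx(void)    { return "rgb32_mmx"; }

/* Both the SIMD level and the format are tested at run time, so a flag
 * set restricted by the user is honoured. NULL means "use the C code". */
static convert_fn pick_converter(int cpu_flags, enum dst_format fmt)
{
    if (cpu_flags & FLAG_SSSE3) {
        switch (fmt) {
        case FMT_RGB24: return rgb24_ssse3;
        case FMT_RGB32: return rgb32_ssse3;
        }
    }
    if ((cpu_flags & FLAG_MMXEXT) && fmt == FMT_RGB24)
        return rgb24_mmxext;
    if ((cpu_flags & FLAG_MMX) && fmt == FMT_RGB32)
        return rgb32_mmx;
    return NULL;
}

/* Small self-check: an empty flag set must never select SIMD code. */
static void check_dispatch(void)
{
    assert(pick_converter(0, FMT_RGB24) == NULL);
    assert(pick_converter(0, FMT_RGB32) == NULL);
}
```

With a compile-time '#if' in the template, the choice would already be fixed
when the file is built, which is why a run-time flag restriction could not
take effect there.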

Thank you,
Ting Fu
> 
>  libswscale/x86/Makefile   |   1 +
>  libswscale/x86/swscale.c  |  16 +-
>  libswscale/x86/yuv2rgb.c  |  66 ++---
>  libswscale/x86/yuv2rgb_template.c | 467 ++
>  libswscale/x86/yuv_2_rgb.asm  | 270 +
>  5 files changed, 405 insertions(+), 415 deletions(-)  create mode 100644
> libswscale/x86/yuv_2_rgb.asm
> 
> diff --git a/libswscale/x86/Makefile b/libswscale/x86/Makefile index
> f317d5dd9b..831d5359aa 100644
> --- a/libswscale/x86/Makefile
> +++ b/libswscale/x86/Makefile
> @@ -12,3 +12,4 @@ X86ASM-OBJS += x86/input.o  
> \
> x86/output.o \
> x86/scale.o  \
> x86/rgb_2_rgb.o  \
> +   x86/yuv_2_rgb.o  \
> diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c index
> 0eed4f18d5..e9d474a1e8 100644
> --- a/libswscale/x86/swscale.c
> +++ b/libswscale/x86/swscale.c
> @@ -29,6 +29,14 @@
>  #include "libavutil/cpu.h"
>  #include "libavutil/pixdesc.h"
> 
> +const DECLARE_ALIGNED(8, uint64_t, ff_dither4)[2] = {
> +0x0103010301030103LL,
> +0x0200020002000200LL,};
> +
> +const DECLARE_ALIGNED(8, uint64_t, ff_dither8)[2] = {
> +0x0602060206020602LL,
> +0x0004000400040004LL,};
> +
>  #if HAVE_INLINE_ASM
> 
>  #define DITHER1XBPP
> @@ -38,14 +46,6 @@ DECLARE_ASM_CONST(8, uint64_t, bFC)=
> 0xFCFCFCFCFCFCFCFCLL;
>  DECLARE_ASM_CONST(8, uint64_t, w10)=   0x0010001000100010LL;
>  DECLARE_ASM_CONST(8, uint64_t, w02)=   0x0002000200020002LL;
> 
> -const DECLARE_ALIGNED(8, uint64_t, ff_dither4)[2] = {
> -0x0103010301030103LL,
> -0x0200020002000200LL,};
> -
> -const DECLARE_ALIGNED(8, uint64_t, ff_dither8)[2] = {
> -0x0602060206020602LL,
> -0x0004000400040004LL,};
> -
>  DECLARE_ASM_CONST(8, uint64_t, b16Mask)=   0x001F001F001F001FLL;
>  DECLARE_ASM_CONST(8, uint64_t, g16Mask)=   0x07E007E007E007E0LL;
>  DECLARE_ASM_CONST(8, uint64_t, r16Mask)=   0xF800F800F800F800LL;
> diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c index
> 5e2f77c20f..dd813d4deb 100644
> --- a/libswscale/x86/yuv2rgb.c
> +++ b/libswscale/x86/yuv2rgb.c
> @@ -37,7 +37,7 @@
>  #include "libavutil/x86/cpu.h"
>  #include "libavutil/cpu.h"
> 
> -#if HAVE_INLINE_ASM
> +#if HAVE_X86ASM
> 
>  #define DITHER1XBPP // only for MMX
> 
> @@ -50,32 +50,31 @@ DECLARE_ASM_CONST(8, uint64_t, pb_03) =
> 0x0303030303030303ULL;  DECLARE_ASM_CONST(8, uint64_t, pb_07) =
> 0x0707070707070707ULL;
> 
>  //MMX versions
> -#if HAVE_MMX_INLINE && HAVE_6REGS
> +#if HAVE_MMX
>  #undef RENAME
>  #undef COMPILE_TEMPLATE_MMXEXT
>  #define COMPILE_TEMPLATE_MMXEXT 0
>  #define RENAME(a) a ## _mmx
>  #include "yuv2rgb_template.c"
> -#endif /* HAVE_MMX_INLINE && HAVE_6REGS */
> +#endif /* HAVE_MMX */
> 
>  // MMXEXT versions
> -#if HAVE_MMXEXT_INLINE && HAVE_6REGS
> +#if HAVE_MMXEXT
>  #undef RENAME
>  #undef COMPILE_TEMPLATE_MMXEXT
>  #define COMPILE_TEMPLATE_MMXEXT 1
>  #define RENAME(a) a ## _mmxext
>  #include "yuv2rgb_template.c"
> -#endif /* HAVE_MMXEXT_INLINE && HAVE_6REGS */
> +#endif /* HAVE_MMXEXT */
> 
> -#endif /* HAVE_INLINE_ASM */
> +#endif /* HAVE_X86ASM */
> 
>  av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)  { -#if
> HAVE_MMX_INLINE && HAVE_6REGS
> +#if HAVE_X86ASM
>  int cpu_flags = av_get_cpu_flags();
> 
> -#if HAVE_MMXEXT_INLINE
> -if (INLINE_MMXEXT(cpu_flags)) {
> +if (EXTERNAL_MMXEXT(cpu_flags)) {
>  switch (c->dstFormat) {
>  case AV_PIX_FMT_RGB24:
>  return yuv420_rgb24_mmxext; @@ -83,37 +82,36 @@ av_cold
> SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
>  return yuv420_bgr24_mmxext;
>  }
>  }
> -#endif
> 
> -if (INLINE_MMX(cpu_flags)) {
> +if (EXTERNAL_MMX(cpu_flags)) {
>  switch (c->dstFormat) {
> -case AV_PIX_FMT_RGB32:
> -if (c->srcFormat ==

Re: [FFmpeg-devel] [PATCH V6 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2020-01-09 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Thursday, January 9, 2020 06:17 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V6 2/2] libswscale/x86/yuv2rgb: add
> ssse3 version
> 
> On Wed, Jan 08, 2020 at 10:25:59AM +0800, Ting Fu wrote:
> > Tested using this command:
> > /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> > rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >
> > The fps increase from 389 to 640 on Intel(R) Core(TM) i7-8700K CPU @
> > 3.70GHz
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/yuv2rgb.c  |   7 +-
> >  libswscale/x86/yuv2rgb_template.c |  58 +++-
> >  libswscale/x86/yuv_2_rgb.asm  | 145 ++
> >  3 files changed, 191 insertions(+), 19 deletions(-)
> >
> > diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c index
> > f3d2bb526e..7015266a7e 100644
> > --- a/libswscale/x86/yuv2rgb.c
> > +++ b/libswscale/x86/yuv2rgb.c
> > @@ -61,13 +61,18 @@ DECLARE_ASM_CONST(8, uint64_t, pb_07) =
> > 0x0707070707070707ULL;  #define COMPILE_TEMPLATE_MMXEXT 1  #endif
> /*
> > HAVE_MMXEXT */
> >
> > +//SSSE3 versions
> > +#if HAVE_SSSE3
> > +#define COMPILE_TEMPLATE_SSSE3 1
> > +#endif
> > +
> >  #include "yuv2rgb_template.c"
> >
> >  av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)  {
> >  int cpu_flags = av_get_cpu_flags();
> >
> > -if (EXTERNAL_MMX(cpu_flags)) {
> > +if (EXTERNAL_MMX(cpu_flags) || EXTERNAL_SSSE3(cpu_flags)) {
> 
> I would expect that EXTERNAL_SSSE3 implies EXTERNAL_MMX

Hi Michael,

You are right, EXTERNAL_SSSE3 does imply EXTERNAL_MMX.
However, I found two issues while thinking over what you said,
and I have adjusted the if clause to solve them in PATCH V7.

Thank you
Ting Fu
[...]


Re: [FFmpeg-devel] [PATCH V5 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2020-01-07 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Tuesday, January 7, 2020 09:09 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V5 2/2] libswscale/x86/yuv2rgb: add
> ssse3 version
> 
> On Mon, Jan 06, 2020 at 03:28:47PM +0800, Ting Fu wrote:
> > Tested using this command:
> > /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> > rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >
> 
> > The fps increase from 389 to 640 on my local machine.
> 
> Please include information about the machine, mainly CPU in the commit
> message

Hi Michael,

My CPU is Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz.
I will add it only to the commit message of patch 2/2 in the next version, not 
to patch 1/2, since I removed the fps info from 1/2. Would that be OK?

Thank you,
Ting Fu

> 
> thx
> 
> 
> [...]
> --
> Michael GnuPG fingerprint:
> 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> If you think the mosad wants you dead since a long time then you are either
> wrong or dead since a long time.

Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-05 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, January 3, 2020 04:36 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Fri, Jan 03, 2020 at 06:59:28AM +, Fu, Ting wrote:
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Michael Niedermayer
> > > Sent: Friday, December 27, 2019 07:38 PM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > On Thu, Dec 19, 2019 at 11:35:51AM +0800, Ting Fu wrote:
> > > > Tested using this command:
> > > > ./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \
> > > > -vcodec rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> > > >
> > >
> > > > The fps increase from 151 to 389 on my local machine.
> > >
> > > Thats nice but why is there such a difference from changing the way
> > > the code is assembled ?
> > > This should definitly be explained more detailedly in the commit
> > > message
> > >
> > Hi, Michael
> >
> > The fps increasing means mmx compared to C code, not inline compared nasm
> one. I will remove it from the commit message next patch version.
> 
> please test apples against apples, a benchmark of the inline vs NASM code
> certainly cannot hurt. Testing unoptimized vs new optimized code is not
> interresting. Testing old optimized vs new optimized code is interresting
> 

Hi Michael,
In my tests the NASM code has the same performance as the inline assembly, 
which is 352 fps this time.
So I have simply removed the fps figure from the commit message.

> 
> >
> > >
> > > >
> > > > Signed-off-by: Ting Fu 
> > > > ---
> > > >  libswscale/x86/Makefile   |   1 +
> > > >  libswscale/x86/swscale.c  |  16 +-
> > > >  libswscale/x86/yuv2rgb.c  |  81 +++---
> > > >  libswscale/x86/yuv2rgb_template.c | 441 ++
> > > >  libswscale/x86/yuv_2_rgb.asm  | 270 ++
> > > >  5 files changed, 395 insertions(+), 414 deletions(-)  create mode
[...]
> > >
> > > i would expect EXTERNAL_MMXEXT to imply EXTERNAL_MMX
> > >
> >
> > I was thinking the mmx-only processor. Under this circumstance, the mmx-
> only processor will not be accelerated. Should that be OK? Or it means I will 
> be
> OK for not care much about old mmx-only processor in following patches?
> 
> no
> If MMXEXT implies MMX then MMX == (MMX || MMXEXT)
> 

Modified in new patch v5.
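
The identity quoted above can be checked exhaustively. The following is a
small sketch under the stated assumption that MMXEXT never appears without
MMX; FLAG_MMX and FLAG_MMXEXT are hypothetical bits used only for this
illustration, not the real AV_CPU_FLAG_* values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only. */
enum { FLAG_MMX = 1 << 0, FLAG_MMXEXT = 1 << 1 };

/* A flag set is consistent when MMXEXT never appears without MMX. */
static bool consistent(int flags)
{
    return !(flags & FLAG_MMXEXT) || (flags & FLAG_MMX);
}

/* Count the consistent flag sets for which (MMX || MMXEXT) and plain MMX
 * give different answers. */
static int count_disagreements(void)
{
    int bad = 0;
    for (int flags = 0; flags < 4; flags++) {
        if (!consistent(flags))
            continue;
        bool long_test  = (flags & FLAG_MMX) || (flags & FLAG_MMXEXT);
        bool short_test = (flags & FLAG_MMX) != 0;
        if (long_test != short_test)
            bad++;
    }
    return bad;
}

/* Small self-check of the identity. */
static void check_identity(void)
{
    assert(count_disagreements() == 0);
}
```

The only flag set where the two tests differ is MMXEXT without MMX, which the
implication rules out.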

Thank you
Ting Fu

> [...]

Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2020-01-02 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, December 27, 2019 07:38 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Thu, Dec 19, 2019 at 11:35:51AM +0800, Ting Fu wrote:
> > Tested using this command:
> > ./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> > rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >
> 
> > The fps increase from 151 to 389 on my local machine.
> 
> Thats nice but why is there such a difference from changing the way the code 
> is
> assembled ?
> This should definitly be explained more detailedly in the commit message
> 
Hi, Michael

The fps increase refers to MMX compared with the C code, not inline assembly 
compared with the NASM version. 
I will remove it from the commit message in the next patch version.

> 
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  81 +++---
> >  libswscale/x86/yuv2rgb_template.c | 441 ++
> >  libswscale/x86/yuv_2_rgb.asm  | 270 ++
> >  5 files changed, 395 insertions(+), 414 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> >
> > diff --git a/libswscale/x86/Makefile b/libswscale/x86/Makefile index
> > f317d5dd9b..831d5359aa 100644
> > --- a/libswscale/x86/Makefile
> > +++ b/libswscale/x86/Makefile
> > @@ -12,3 +12,4 @@ X86ASM-OBJS += x86/input.o
> >   \
> > x86/output.o \
> > x86/scale.o  \
> > x86/rgb_2_rgb.o  \
> > +   x86/yuv_2_rgb.o  \
> > diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c index
> > 0eed4f18d5..e9d474a1e8 100644
> > --- a/libswscale/x86/swscale.c
> > +++ b/libswscale/x86/swscale.c
> > @@ -29,6 +29,14 @@
> >  #include "libavutil/cpu.h"
> >  #include "libavutil/pixdesc.h"
> >
> > +const DECLARE_ALIGNED(8, uint64_t, ff_dither4)[2] = {
> > +0x0103010301030103LL,
> > +0x0200020002000200LL,};
> > +
> > +const DECLARE_ALIGNED(8, uint64_t, ff_dither8)[2] = {
> > +0x0602060206020602LL,
> > +0x0004000400040004LL,};
> > +
> >  #if HAVE_INLINE_ASM
> >
> >  #define DITHER1XBPP
> > @@ -38,14 +46,6 @@ DECLARE_ASM_CONST(8, uint64_t, bFC)=
> 0xFCFCFCFCFCFCFCFCLL;
> >  DECLARE_ASM_CONST(8, uint64_t, w10)=   0x0010001000100010LL;
> >  DECLARE_ASM_CONST(8, uint64_t, w02)=   0x0002000200020002LL;
> >
> > -const DECLARE_ALIGNED(8, uint64_t, ff_dither4)[2] = {
> > -0x0103010301030103LL,
> > -0x0200020002000200LL,};
> > -
> > -const DECLARE_ALIGNED(8, uint64_t, ff_dither8)[2] = {
> > -0x0602060206020602LL,
> > -0x0004000400040004LL,};
> > -
> >  DECLARE_ASM_CONST(8, uint64_t, b16Mask)=   0x001F001F001F001FLL;
> >  DECLARE_ASM_CONST(8, uint64_t, g16Mask)=   0x07E007E007E007E0LL;
> >  DECLARE_ASM_CONST(8, uint64_t, r16Mask)=   0xF800F800F800F800LL;
> > diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c index
> > 5e2f77c20f..ed9b613cab 100644
> > --- a/libswscale/x86/yuv2rgb.c
> > +++ b/libswscale/x86/yuv2rgb.c
> > @@ -37,7 +37,7 @@
> >  #include "libavutil/x86/cpu.h"
> >  #include "libavutil/cpu.h"
> >
> > -#if HAVE_INLINE_ASM
> > +#if HAVE_X86ASM
> >
> >  #define DITHER1XBPP // only for MMX
> >
> > @@ -50,70 +50,51 @@ DECLARE_ASM_CONST(8, uint64_t, pb_03) =
> > 0x0303030303030303ULL;  DECLARE_ASM_CONST(8, uint64_t, pb_07) =
> > 0x0707070707070707ULL;
> >
> >  //MMX versions
> > -#if HAVE_MMX_INLINE && HAVE_6REGS
> > -#undef RENAME
> > +#if HAVE_MMX
> >  #undef COMPILE_TEMPLATE_MMXEXT
> >  #define COMPILE_TEMPLATE_MMXEXT 0
> > -#define RENAME(a) a ## _mmx
> > -#include "yuv2rgb_template.c"
> > -#endif /* HAVE_MMX_INLINE && HAVE_6REGS */
> > +#endif /* HAVE_MMX */
> >
> >  // MMXEXT versions
> > -#if HAVE_MMXEXT_INLINE && HAVE_6REGS
> > -#undef RENAME
> > +#if HAVE_MMXEXT
> >  #undef COMPILE_TEMPLATE_MMXEXT
> >  #define COMPILE_TEMPLATE_MMXEXT 1
> > -#define RENAME(a) a ## _mmxext
> > -#include "yuv2rgb_template.c"
> > -#endif /* HAVE_MMXEXT_INLINE && HAVE_6REGS */
> > +#endif /* HAVE_MMXEXT */
> >
> > -#endif /* HAVE_INLINE_ASM */
> > +#include "yuv2rgb_template.c"
> >
> >  av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)  { -#if
> > HAVE_MMX_INLINE && HAVE_6REGS
> >  int cpu_flags = av_get_cpu_flags();
> >
> > -#if HAVE_MMXEXT_INLINE
> > -if (INLINE_MMXEXT(cpu_flags)) {
> > -switch (c->dstFormat) {
> > -case AV_PIX_FMT_RGB24:
> > -return yuv420_rgb24_mmxext;
> > -case AV_PIX_FMT_BGR24:
> > -return yuv420_bgr24_mmxext;
> > -}
> > -}
> > -#endif
> > -
> > -if (INLINE_MMX(cpu_flags)) {
> 
> >

Re: [FFmpeg-devel] [PATCH V4 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-25 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting Fu
> Sent: Thursday, December 19, 2019 11:36 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V4 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> Tested using this command:
> /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> 
> The fps increase from 389 to 640 on my local machine.
> 
> Signed-off-by: Ting Fu 
> ---
>  libswscale/x86/yuv2rgb.c  |   8 +-
>  libswscale/x86/yuv2rgb_template.c |  58 +++-
>  libswscale/x86/yuv_2_rgb.asm  | 145 ++
>  3 files changed, 192 insertions(+), 19 deletions(-)
> 
> diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c index
> ed9b613cab..b83dd7089a 100644
> --- a/libswscale/x86/yuv2rgb.c
> +++ b/libswscale/x86/yuv2rgb.c
[...]
> +
> +INIT_XMM ssse3
> +yuv2rgb_fn yuv,  rgb, 24
> +yuv2rgb_fn yuv,  bgr, 24
> +yuv2rgb_fn yuv,  rgb, 32
> +yuv2rgb_fn yuv,  bgr, 32
> +yuv2rgb_fn yuva, rgb, 32
> +yuv2rgb_fn yuva, bgr, 32
> +yuv2rgb_fn yuv,  rgb, 15
> +yuv2rgb_fn yuv,  rgb, 16
> --
> 2.17.1

A kind ping.

> 

Re: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-17 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Henrik
> Gramner
> Sent: Tuesday, December 17, 2019 08:29 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> On Wed, Dec 4, 2019 at 4:03 AM Ting Fu  wrote:
> > +VBROADCASTSD y_offset, [pointer_c_ditherq + 8  * 8]
> > +VBROADCASTSD u_offset, [pointer_c_ditherq + 9  * 8]
> > +VBROADCASTSD v_offset, [pointer_c_ditherq + 10 * 8]
> > +VBROADCASTSD ug_coff,  [pointer_c_ditherq + 7  * 8]
> > +VBROADCASTSD vg_coff,  [pointer_c_ditherq + 6  * 8]
> > +VBROADCASTSD y_coff,   [pointer_c_ditherq + 3  * 8]
> > +VBROADCASTSD ub_coff,  [pointer_c_ditherq + 5  * 8]
> > +VBROADCASTSD vr_coff,  [pointer_c_ditherq + 4  * 8]
> [...]
> > +vpbroadcastq m2, mu_offset
> > +vpbroadcastq m3, mv_offset
> > +vpbroadcastq m4, my_offset
> 
> VBROADCASTSD/vpbroadcastq -> movddup

Hi, the code using VBROADCASTSD is intended for reuse in future AVX2 code, so 
might it be better to keep it?
You are right that vpbroadcastq should be changed, but wouldn't VBROADCASTSD be 
the better replacement?

> 
> > +mova m2, m0
> > +mova m3, m1
> > +vpbroadcastq m4, mug_coff
> > +vpbroadcastq m5, mvg_coff
> > +pmulhw m2, m4
> > +pmulhw m3, m5
> 
> The register-register moves can be eliminated:
> movddup m2, mug_coff
> movddup m3, mvg_coff
> pmulhw m2, m0
> pmulhw m3, m1

Yes, sure, I will modify it in the next patch.
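
The reason the register-register moves can go away is that pmulhw is
commutative, so the coefficient can be loaded straight into the destination
and multiplied by the sample register. A scalar sketch of one 16-bit lane is
given below; mulhi16(), with_copy(), without_copy() and the sampling steps are
names and choices made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one pmulhw lane: the high 16 bits of the signed
 * 32-bit product. The final cast assumes the usual two's-complement
 * wrap-around when converting to int16_t. */
static int16_t mulhi16(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (int16_t)(uint16_t)((uint32_t)p >> 16);
}

/* Original order: copy the sample, then multiply the copy by the coefficient. */
static int16_t with_copy(int16_t sample, int16_t coeff)
{
    int16_t tmp = sample;           /* mova    m2, m0       */
    return mulhi16(tmp, coeff);     /* pmulhw  m2, m4       */
}

/* Suggested order: load the coefficient into the destination and
 * multiply it by the untouched sample register. */
static int16_t without_copy(int16_t sample, int16_t coeff)
{
    int16_t tmp = coeff;            /* movddup m2, mug_coff */
    return mulhi16(tmp, sample);    /* pmulhw  m2, m0       */
}

/* Compare both orders over a coarse grid of inputs. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int s = -32768; s <= 32767; s += 257)
        for (int c = -32768; c <= 32767; c += 509)
            if (with_copy((int16_t)s, (int16_t)c) !=
                without_copy((int16_t)s, (int16_t)c))
                bad++;
    return bad;
}

/* Small self-check of the commutativity claim. */
static void check_commutative(void)
{
    assert(count_mismatches() == 0);
}
```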

> 
> > +mova m0, m3
> > +pshufb m0, [mask_evenword] ; R2 G2 R6 G6 R10 G10 R14 G14 -- -- -- -- -- -- -- --
> > +mova m1, m2
> > +pshufb m1, [mask_oddword]  ; G1 B1 G5 B5 G9 B9 G13 B13 -- -- -- -- -- -- -- --
> > +punpcklwd m1, m0   ; G1 B1 R2 G2 G5 B5 R6 G6 G9 B9 R10 G10 G13 B13 R14 G14
> > +mova m0,m6
> > +pshufb m0, [mask_evenword] ; B2 R3 B6 R7 B10 R11 B14 R15 -- -- -- -- -- -- -- --
> > +mova m4, m2
> > +pshufb m4, [mask_evenword] ; G3 B3 G7 B7 G11 B11 G15 G15 -- -- -- -- -- -- -- --
> > +punpcklwd m0, m4
> > +pshufb m3, [mask_oddword]  ; R0 G0 R4 G4 R8 G8 R12 G12 -- -- -- -- -- -- -- --
> > +pshufb m6, [mask_oddword]  ; B0 R1 B4 R5 B8 R9 B12 R13 -- -- -- -- -- -- -- --
> > +mova m5, m0
> > +mova m7, m1
> > +punpcklwd m3, m6 ; R0  G0  B0  R1  R4  G4  B4  R5  R8  G9  B8  R9  R12 G12 B12 R13
> > +punpckldq m7, m5 ; G1  B1  R2  G2  B2  R3  G3  B3  G5  B5  R5  G5  B6  R7  G7 B7
> > +punpckhdq m1, m0 ; G9  B9  R10 G10 B10 R11 G11 B11 G13 B13 R14 G14 B14 R15 G15 B15
> > +mova m0, m3
> > +mova m2, m7
> > +pshufb m0, [mask_dw01to03] ; R0 G0 B0 R1 -- -- -- -- -- -- -- -- R4 G4 B4 R5
> > +pshufb m2, [mask_dw01to12] ; -- -- -- -- G1 B1 R2 G2 B2 R3 G3 B3 -- -- -- --
> > +por m0, m2 ; R0 G0 B0 R1 G1 B1 R2 G2 B2 R3 G3 B3 R4 G4 B4 R5
> > +mova m2, m3
> > +mova m4, m7
> > +pshufb m2, [mask_dw2to2]   ; -- -- -- -- -- -- -- -- R8 G8 B8 R9 -- -- -- --
> > +pshufb m4, [mask_dw23to01] ; G5 B5 R6 G6 B6 R7 G7 B7 -- -- -- -- -- -- -- --
> > +por m2, m4
> > +mova m4, m1
> > +pshufb m4, [mask_dw0to3]   ; -- -- -- -- -- -- -- -- -- -- -- -- G9 B9 R10 G10
> > +por m2, m4 ; G5 B5 R6 G6 B6 R7 G7 B7 R8 G8 B8 R9 G9 B9 R10 G10
> > +pshufb m3, [mask_dw3to1] ; --- --- --- --- R12 G12 B12 R13 --- --- --- --- --- --- --- ---
> > +pshufb m1, [mask_dw123to023] ; B10 R11 G11 B11 --- --- --- --- G13 B13 R14 G14 B14 R15 G15 B15
> > +por m1, m3   ; B10 R11 G11 B11 R12 G12 B12 R13 G13 B13 R14 G14 B14 R15 G15 B15
> Probably faster to do fewer shuffles in favor of masking instead, e.g.
> something along the lines of
> 
> rgb_shuf1: db  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15,  4,  5, 10, 11
> rgb_shuf2: db 10, 11,  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15,  4,  5
> rgb_shuf3: db  4,  5, 10, 11,  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15
> rgb_mask1: db -1, -1,  0,  0,  0,  0, -1, -1,  0,  0,  0,  0, -1, -1,  0,  0
> rgb_mask2: db  0,  0, -1, -1,  0,  0,  0,  0, -1, -1,  0,  0,  0,  0, -1, -1
> rgb_mask3: db  0,  0,  0,  0, -1, -1,  0,  0,  0,  0, -1, -1,  0,  0,  0,  0 
> [...]
> pshufb m3, [rgb_shuf1] ; r0  g0  r6  g6  r12 g12 r2  g2  r8  g8  r14 g14 r4  g4  r10 g10
> pshufb m6, [rgb_shuf2] ; b10 r11 b0  r1  b6  r7  b12 r13 b2  r3  b8  r9  b14 r15 b4  r5
> pshufb m2, [rgb_shuf3] ; g5  b5  g11 b11 g1  b1  g7  b7  g13 b13 g3  b3  g9  b9  g15 b15
> mova   m7, [rgb_mask1]
> mova   m4, [rgb_mask2]
> mova   m5, [rgb_mask3]
> pand   m0, m7, m3  ; r0  g0  ___ ___ ___ ___ r2  g2  ___ ___ ___ ___ r4  g4  ___ ___
> pand   m1, m4, m6  ; ___ ___ b0  r1  ___ ___ ___ ___ b2  r3  ___ ___ ___ ___ b4  r5
> por    m0, m1
> pand   m1, m5, m2  ; ___ ___ ___ ___ g1  b1  ___ ___ ___ ___ g3  b3  ___ ___ ___ ___
> por    m0, m1  ;
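
The shuffle-then-mask suggestion quoted above can be modelled in plain C to
see that it yields the first 16 bytes of packed RGB24. This is only a scalar
sketch: the input layouts of m3, m6 and m2 are inferred from the comments in
the quoted code, and shuffle16(), the R()/G()/B() stand-in values and the
helper names are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of pshufb for indices 0..15 (the zeroing behaviour for
 * indices with the high bit set is not needed here). */
static void shuffle16(uint8_t dst[16], const uint8_t src[16], const uint8_t idx[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = src[idx[i] & 15];
    memcpy(dst, tmp, 16);
}

/* Tables copied from the quoted suggestion; -1 is written as 255. */
static const uint8_t rgb_shuf1[16] = { 0, 1, 6, 7, 12, 13, 2, 3, 8, 9, 14, 15, 4, 5, 10, 11 };
static const uint8_t rgb_shuf2[16] = { 10, 11, 0, 1, 6, 7, 12, 13, 2, 3, 8, 9, 14, 15, 4, 5 };
static const uint8_t rgb_shuf3[16] = { 4, 5, 10, 11, 0, 1, 6, 7, 12, 13, 2, 3, 8, 9, 14, 15 };
static const uint8_t rgb_mask1[16] = { 255, 255, 0, 0, 0, 0, 255, 255, 0, 0, 0, 0, 255, 255, 0, 0 };
static const uint8_t rgb_mask2[16] = { 0, 0, 255, 255, 0, 0, 0, 0, 255, 255, 0, 0, 0, 0, 255, 255 };
static const uint8_t rgb_mask3[16] = { 0, 0, 0, 0, 255, 255, 0, 0, 0, 0, 255, 255, 0, 0, 0, 0 };

/* Stand-in sample values so each byte identifies its channel and pixel. */
static uint8_t R(int i) { return (uint8_t)(0x10 + i); }
static uint8_t G(int i) { return (uint8_t)(0x40 + i); }
static uint8_t B(int i) { return (uint8_t)(0x80 + i); }

/* Build the first 16 output bytes from three registers laid out as the
 * quoted comments imply. */
static void pack_first16(uint8_t out[16])
{
    uint8_t m3[16], m6[16], m2[16];
    for (int k = 0; k < 8; k++) {
        m3[2 * k] = R(2 * k);     m3[2 * k + 1] = G(2 * k);     /* r0 g0 r2 g2 ... */
        m6[2 * k] = B(2 * k);     m6[2 * k + 1] = R(2 * k + 1); /* b0 r1 b2 r3 ... */
        m2[2 * k] = G(2 * k + 1); m2[2 * k + 1] = B(2 * k + 1); /* g1 b1 g3 b3 ... */
    }
    shuffle16(m3, m3, rgb_shuf1);
    shuffle16(m6, m6, rgb_shuf2);
    shuffle16(m2, m2, rgb_shuf3);
    for (int i = 0; i < 16; i++)   /* pand + por on every byte */
        out[i] = (uint8_t)((m3[i] & rgb_mask1[i]) |
                           (m6[i] & rgb_mask2[i]) |
                           (m2[i] & rgb_mask3[i]));
}

/* Returns 1 when out holds r0 g0 b0 r1 g1 b1 ... in order. */
static int is_packed_rgb24(const uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        int px = i / 3;
        uint8_t want = i % 3 == 0 ? R(px) : i % 3 == 1 ? G(px) : B(px);
        if (out[i] != want)
            return 0;
    }
    return 1;
}

/* Small self-check of the packing. */
static void check_packing(void)
{
    uint8_t out[16];
    pack_first16(out);
    assert(is_packed_rgb24(out));
}
```

Each source register is shuffled once and then reused with different masks,
which is where the saving over one shuffle per output fragment would come
from.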

Re: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-16 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Fu,
> Ting
> Sent: Monday, December 9, 2019 09:49 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add
> ssse3 version
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> Ting
> > Fu
> > Sent: Wednesday, December 4, 2019 11:00 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add
> > ssse3 version
> >
> > Tested using this command:
> > /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> > rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >
> > The fps increase from 389 to 640 on my local machine.
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/yuv2rgb.c  |   8 +-
> >  libswscale/x86/yuv2rgb_template.c |  58 ++-
> >  libswscale/x86/yuv_2_rgb.asm  | 162 +++---
> >  3 files changed, 209 insertions(+), 19 deletions(-)
> >
> [...]
> 
> Ping.

Ping?

> 

Re: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-08 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting Fu
> Sent: Wednesday, December 4, 2019 11:00 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V3 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> Tested using this command:
> /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> 
> The fps increase from 389 to 640 on my local machine.
> 
> Signed-off-by: Ting Fu 
> ---
>  libswscale/x86/yuv2rgb.c  |   8 +-
>  libswscale/x86/yuv2rgb_template.c |  58 ++-
>  libswscale/x86/yuv_2_rgb.asm  | 162 +++---
>  3 files changed, 209 insertions(+), 19 deletions(-)
> 
[...]

Ping.


Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2019-12-03 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Carl
> Eugen Hoyos
> Sent: Tuesday, December 3, 2019 04:23 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline
> assembly into nasm code
> 
> Am Di., 3. Dez. 2019 um 04:53 Uhr schrieb Fu, Ting :
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Carl Eugen Hoyos
> > > Sent: Monday, December 2, 2019 05:49 PM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > Am Mo., 2. Dez. 2019 um 04:17 Uhr schrieb Fu, Ting :
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: ffmpeg-devel  On Behalf
> > > > > Of Michael Niedermayer
> > > > > Sent: Friday, November 29, 2019 05:33 AM
> > > > > To: FFmpeg development discussions and patches
> > > > > 
> > > > > Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb:
> > > > > Change inline assembly into nasm code
> > > > >
> > > > > On Thu, Nov 28, 2019 at 02:07:07PM +0800, Ting Fu wrote:
> > > > > > Signed-off-by: Ting Fu 
> > > > > > ---
> > > > > >  libswscale/x86/Makefile   |   1 +
> > > > > >  libswscale/x86/swscale.c  |  16 +-
> > > > > >  libswscale/x86/yuv2rgb.c  |  81 ++
> > > > > >  libswscale/x86/yuv2rgb_template.c | 441 
> > > > > > ++
> > > > > >  libswscale/x86/yuv_2_rgb.asm  | 270 ++
> > > > > >  5 files changed, 394 insertions(+), 415 deletions(-)  create
> > > > > > mode
> > > > > > 100644 libswscale/x86/yuv_2_rgb.asm
> > > > >
> > > > > This changes the output, i presume that is unintentional
> > > > >
> > > > > ./ffmpeg -cpuflags 0 -i matrixbench_mpeg2.mpg -t 1 -vf
> > > > > format=yuv420p,format=rgb565le -an -f framecrc -
> > > > >
> > > > > 0,  0,  0,1,   829440, 0x1bd78b86
> > > > > 0,  1,  1,1,   829440, 0x85910b33
> > > > > ...
> > > > > vs.
> > > > > 0,  0,  0,1,   829440, 0x31f4a2bd
> > > > > 0,  1,  1,1,   829440, 0xf0c66218
> > > > > ...
> > > > >
> > > > >
> > > >
> > > > Hi Michael,
> > > >
> > > > This unexpected change is because of the missing verify of current
> > > > SIMD
> > > support.
> > > > So, when cpuflag=0, ffmpeg used mmx code to compute as default.
> > > > I added if (EXTERNAL_XXX(cpu_flags)) to verify the SIMD in
> > > libswscale/x86/yuv2rgb.c.
> > >
> > > Could the patch be split to make this change easier to understand?
> >
> > Hi Carl,
> >
> > I didn’t come across any good idea to separate the PATCH.
> > Since the [PATCH 1/2] is consisted of mmx code for
> yuv2rgb24/bgr24/rgb32/bgr32/rgb15/rgb16 and they're all come from former
> inline assembly.
> > Should it be separated into something like PATCH 1: mmx
> > yuv2rgb24/bgr24 PATCH 2: mmx yuv2rgb32/bgr32 PATCH 3: mmx
> > yuv2rgb15/rgb16 Or adding more comments in nasm file would be more
> > helpful?
> >
> > Can you show me if there is any better solution? I cannot be more grateful 
> > to
> it.
> 
> I didn't want to imply that I know of a better way, just that your answer to
> Michael's question made me wonder if adding EXTERNAL_XXX() in a separate
> patch would fix his concern.

Hi Carl,

The EXTERNAL_XXX() check was added to fix the wrong output under "-cpuflags 0". 
Without it, ffmpeg would use the MMX code by default for the unscaled 
yuv2rgb conversion.
So, in my opinion, it would be awkward to put it in a separate patch. 
In addition, I have added more comments to the nasm file in PATCH V3. I hope 
that helps.

Thank you so much for your review.
Ting Fu

> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH V2 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-03 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Tuesday, December 3, 2019 04:11 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> On Mon, Dec 02, 2019 at 11:12:42AM +0800, Ting Fu wrote:
> > Tested using this command:
> > /ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \ -vcodec
> > rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> >
> > The fps increase from 389 to 640 on my local machine.
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/yuv2rgb.c  |   8 +-
> >  libswscale/x86/yuv2rgb_template.c |  58 ++-
> >  libswscale/x86/yuv_2_rgb.asm  | 162 +++---
> >  3 files changed, 209 insertions(+), 19 deletions(-)
> 
> one of these patches seems to produce new warnings like:
> libswscale/x86/yuv2rgb_template.c: In function ‘yuv420_rgb15’:
> libswscale/x86/yuv2rgb_template.c:113:5: warning: passing argument 5 of
> ‘ff_yuv_420_rgb15_ssse3’ from
> 

Hi Michael,

This warning appears because one formal parameter of ff_yuv_420_rgbXX_() was 
declared as uint8_t when it is actually uint64_t. I have corrected it in 
PATCH V3.
Thank you for your review; I will pay more attention to warnings.
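
As a rough illustration of this kind of warning: when a table of uint64_t
values is passed to a function whose prototype declares a uint8_t pointer,
the compiler reports an incompatible pointer type. The sketch below shows a
matching prototype only; first_entry() and dither_table are hypothetical
stand-ins, not the real prototype of the assembly routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the C prototype of an assembly routine.
 * The parameter type matches the table that callers actually pass;
 * declaring it as const uint8_t * instead would make the call in
 * read_table() trigger an incompatible-pointer-type warning. */
static uint64_t first_entry(const uint64_t *table)
{
    return table[0];
}

/* Hypothetical 64-bit constant table for the illustration. */
static const uint64_t dither_table[2] = {
    0x0103010301030103ULL,
    0x0200020002000200ULL,
};

/* The call compiles without a warning because the types agree. */
static uint64_t read_table(void)
{
    return first_entry(dither_table);
}

/* Small self-check of the value read through the matching prototype. */
static void check_table(void)
{
    assert(read_table() == 0x0103010301030103ULL);
}
```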

Ting Fu

> 
> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> The real ebay dictionary, page 2
> "100% positive feedback" - "All either got their money back or didnt complain"
> "Best seller ever, very honest" - "Seller refunded buyer after failed scam"

Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2019-12-02 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Carl
> Eugen Hoyos
> Sent: Monday, December 2, 2019 05:49 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline
> assembly into nasm code
> 
> Am Mo., 2. Dez. 2019 um 04:17 Uhr schrieb Fu, Ting :
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of
> > > Michael Niedermayer
> > > Sent: Friday, November 29, 2019 05:33 AM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > On Thu, Nov 28, 2019 at 02:07:07PM +0800, Ting Fu wrote:
> > > > Signed-off-by: Ting Fu 
> > > > ---
> > > >  libswscale/x86/Makefile   |   1 +
> > > >  libswscale/x86/swscale.c  |  16 +-
> > > >  libswscale/x86/yuv2rgb.c  |  81 ++
> > > >  libswscale/x86/yuv2rgb_template.c | 441 ++
> > > >  libswscale/x86/yuv_2_rgb.asm  | 270 ++
> > > >  5 files changed, 394 insertions(+), 415 deletions(-)  create mode
> > > > 100644 libswscale/x86/yuv_2_rgb.asm
> > >
> > > This changes the output, i presume that is unintentional
> > >
> > > ./ffmpeg -cpuflags 0 -i matrixbench_mpeg2.mpg -t 1 -vf
> > > format=yuv420p,format=rgb565le -an -f framecrc -
> > >
> > > 0,  0,  0,1,   829440, 0x1bd78b86
> > > 0,  1,  1,1,   829440, 0x85910b33
> > > ...
> > > vs.
> > > 0,  0,  0,1,   829440, 0x31f4a2bd
> > > 0,  1,  1,1,   829440, 0xf0c66218
> > > ...
> > >
> > >
> >
> > Hi Michael,
> >
> > This unexpected change is because of the missing verify of current SIMD
> support.
> > So, when cpuflag=0, ffmpeg used mmx code to compute as default.
> > I added if (EXTERNAL_XXX(cpu_flags)) to verify the SIMD in
> libswscale/x86/yuv2rgb.c.
> 
> Could the patch be split to make this change easier to understand?

Hi Carl,

I could not come up with a good way to split the patch, since [PATCH 1/2] 
consists of the mmx code for yuv2rgb24/bgr24/rgb32/bgr32/rgb15/rgb16, all of 
which comes from the former inline assembly.
Should it be split into something like 
PATCH 1: mmx yuv2rgb24/bgr24
PATCH 2: mmx yuv2rgb32/bgr32
PATCH 3: mmx yuv2rgb15/rgb16
Or would adding more comments to the nasm file be more helpful?

If there is a better solution, please let me know; I would be very grateful.

Thank you,
Ting Fu

> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-12-01 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, November 29, 2019 04:51 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> On Thu, Nov 28, 2019 at 02:07:08PM +0800, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/yuv2rgb.c  |   5 +
> >  libswscale/x86/yuv2rgb_template.c |  58 ++-
> >  libswscale/x86/yuv_2_rgb.asm  | 163 +++---
> >  3 files changed, 208 insertions(+), 18 deletions(-)
> 
> breaks build on x86-32
> make
> X86ASMlibswscale/x86/yuv_2_rgb.o
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: label `BROADCAST' inconsistently
> redefined
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:400: note: label `BROADCAST' originally
> defined here
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:331: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:332: ... from macro `yuv2rgb_fn' defined
> here
> src/libswscale/x86/yuv_2_rgb.asm:401: error: parser: instruction expected
> src/libswscale/x86/yuv_2_rgb.asm:333: ... from macro `yuv2rgb_fn' defined
> here
> make: *** [libswscale/x86/yuv_2_rgb.o] Error
> 

Hi Michael,

This error occurs because the BROADCAST macro is defined only under ARCH_X86_64.
I have replaced it with VBROADCASTSD (defined in x86util.asm) in PATCH V2.
In addition, the md5 test passes on linux32/64 and windows64.

Thank you for the review.
Ting Fu

> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> I do not agree with what you have to say, but I'll defend to the death your 
> right
> to say it. -- Voltaire

Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

2019-12-01 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Michael Niedermayer
> Sent: Friday, November 29, 2019 05:33 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] libswscale/x86/yuv2rgb: Change inline
> assembly into nasm code
> 
> On Thu, Nov 28, 2019 at 02:07:07PM +0800, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> >  libswscale/x86/Makefile   |   1 +
> >  libswscale/x86/swscale.c  |  16 +-
> >  libswscale/x86/yuv2rgb.c  |  81 ++
> >  libswscale/x86/yuv2rgb_template.c | 441 ++
> >  libswscale/x86/yuv_2_rgb.asm  | 270 ++
> >  5 files changed, 394 insertions(+), 415 deletions(-)  create mode
> > 100644 libswscale/x86/yuv_2_rgb.asm
> 
> This changes the output, i presume that is unintentional
> 
> ./ffmpeg -cpuflags 0 -i matrixbench_mpeg2.mpg -t 1 -vf
> format=yuv420p,format=rgb565le -an -f framecrc -
> 
> 0,  0,  0,1,   829440, 0x1bd78b86
> 0,  1,  1,1,   829440, 0x85910b33
> ...
> vs.
> 0,  0,  0,1,   829440, 0x31f4a2bd
> 0,  1,  1,1,   829440, 0xf0c66218
> ...
> 
> 

Hi Michael,

This unexpected change is caused by a missing check of the SIMD support that 
is actually available.
As a result, with -cpuflags 0, ffmpeg still used the mmx code by default.
I have added if (EXTERNAL_XXX(cpu_flags)) checks in libswscale/x86/yuv2rgb.c 
to verify the SIMD support.
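
For illustration, the kind of check described above can be sketched in plain C 
as follows. This is only a sketch under assumptions: the CPU_FLAG_* bits, the 
convert_* stand-ins and select_converter() are hypothetical names, not the 
actual libswscale code, which uses the AV_CPU_FLAG_* values and the 
EXTERNAL_XXX() macros mentioned above.

```c
#include <stdint.h>

/* Hypothetical feature bits, standing in for FFmpeg's AV_CPU_FLAG_* values. */
enum {
    CPU_FLAG_MMX   = 1 << 0,
    CPU_FLAG_SSSE3 = 1 << 1,
};

typedef int (*convert_fn)(const uint8_t *src, uint8_t *dst, int width);

/* Stand-ins for the real converters; each only returns a tag. */
static int convert_c(const uint8_t *src, uint8_t *dst, int width)
{
    (void)src; (void)dst; (void)width;
    return 0;
}

static int convert_mmx(const uint8_t *src, uint8_t *dst, int width)
{
    (void)src; (void)dst; (void)width;
    return 1;
}

static int convert_ssse3(const uint8_t *src, uint8_t *dst, int width)
{
    (void)src; (void)dst; (void)width;
    return 2;
}

/*
 * Start from the C fallback and override it only when the matching
 * flag is set. With cpu_flags == 0 no SIMD path is selected.
 */
convert_fn select_converter(int cpu_flags)
{
    convert_fn fn = convert_c;

    if (cpu_flags & CPU_FLAG_MMX)
        fn = convert_mmx;
    if (cpu_flags & CPU_FLAG_SSSE3)
        fn = convert_ssse3;
    return fn;
}
```

Without the flag checks, the mmx stand-in would be chosen unconditionally, 
which corresponds to the behaviour seen with -cpuflags 0.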

Thank you again
Ting Fu

> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> Republics decline into democracies and democracies degenerate into
> despotisms. -- Aristotle

Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3 version

2019-11-27 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Carl
> Eugen Hoyos
> Sent: Thursday, November 28, 2019 02:29 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libswscale/x86/yuv2rgb: add ssse3
> version
> 
> 
> 
> > Am 28.11.2019 um 07:07 schrieb Ting Fu :
> >
> > +#if HAVE_SSSE3
> > +#define COMPILE_TEMPLATE_SSSE3 1
> > +#endif
> 
> Please add a line about performance to the commit message.
> 
> Carl Eugen

Hi Carl,

Sorry for the missing performance info. I tested it with a raw YUV video, 
using the command:
./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i input.yuv -vcodec rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
The results on my local machine are as follows:
output fmt RGB24:
mmx: 337fps   ssse3: 634fps
output fmt RGB32:
mmx: 375fps   ssse3: 653fps
output fmt RGB555:
mmx: 427fps   ssse3: 917fps
I will add this information in PATCH V2.
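
To put the numbers above in relative terms, the speedup is the ratio of the 
ssse3 frame rate to the mmx frame rate, which works out to roughly 1.7x to 
2.1x for the three formats. A minimal sketch of that arithmetic is below; the 
struct and function names are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* One measurement row: output format and the two frame rates reported. */
struct fps_result {
    const char *format;
    double mmx_fps;
    double ssse3_fps;
};

/* Frame rates copied from the measurements listed above. */
static const struct fps_result results[] = {
    { "RGB24",  337.0, 634.0 },
    { "RGB32",  375.0, 653.0 },
    { "RGB555", 427.0, 917.0 },
};

/* Speedup of the ssse3 path over the mmx path for one row. */
double speedup(const struct fps_result *r)
{
    return r->ssse3_fps / r->mmx_fps;
}
```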

Thank you
Ting Fu


Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq

2019-09-27 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of James
> Almer
> Sent: Friday, September 27, 2019 11:27 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq
> 
> On 9/27/2019 12:25 AM, Fu, Ting wrote:
> >
> >
> >> -Original Message-
> >> From: ffmpeg-devel  On Behalf Of
> >> James Almer
> >> Sent: Thursday, September 26, 2019 11:20 PM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test
> >> for vf_eq
> >>
> >> On 9/26/2019 11:43 AM, Andreas Rheinhardt wrote:
> >>> Ting Fu:
> >>>> Signed-off-by: Ting Fu 
> >>>> ---
> >>>>  libavfilter/vf_eq.c   | 13 ---
> >>>>  libavfilter/vf_eq.h   |  1 +
> >>>>  tests/checkasm/Makefile   |  1 +
> >>>>  tests/checkasm/checkasm.c |  3 ++
> >>>>  tests/checkasm/checkasm.h |  1 +
> >>>>  tests/checkasm/vf_eq.c| 79
> >> +++
> >>>>  tests/fate/checkasm.mak   |  1 +
> >>>>  7 files changed, 94 insertions(+), 5 deletions(-)  create mode
> >>>> 100644 tests/checkasm/vf_eq.c
> >>>>
> >>>> diff --git a/libavfilter/vf_eq.c b/libavfilter/vf_eq.c index
> >>>> 2c4c7e4d54..0f9d129255 100644
> >>>> --- a/libavfilter/vf_eq.c
> >>>> +++ b/libavfilter/vf_eq.c
> >>>> @@ -174,12 +174,18 @@ static int set_expr(AVExpr **pexpr, const
> >>>> char
> >> *expr, const char *option, void *
> >>>>  return 0;
> >>>>  }
> >>>>
> >>>> +void ff_eq_init(EQContext *eq)
> >>>> +{
> >>>> +eq->process = process_c;
> >>>> +if (ARCH_X86)
> >>>> +ff_eq_init_x86(eq);
> >>>> +}
> >>>> +
> >>>>  static int initialize(AVFilterContext *ctx)  {
> >>>>  EQContext *eq = ctx->priv;
> >>>>  int ret;
> >>>> -
> >>>> -eq->process = process_c;
> >>>> +ff_eq_init(eq);
> >>>>
> >>>>  if ((ret = set_expr(>contrast_pexpr, eq->contrast_expr,
> "contrast",
> >> ctx)) < 0 ||
> >>>>  (ret = set_expr(>brightness_pexpr,   eq->brightness_expr,
> >> "brightness",   ctx)) < 0 ||
> >>>> @@ -191,9 +197,6 @@ static int initialize(AVFilterContext *ctx)
> >>>>  (ret = set_expr(>gamma_weight_pexpr,
> >>>> eq->gamma_weight_expr,
> >> "gamma_weight", ctx)) < 0 )
> >>>>  return ret;
> >>>>
> >>>> -if (ARCH_X86)
> >>>> -ff_eq_init_x86(eq);
> >>>> -
> >>>>  if (eq->eval_mode == EVAL_MODE_INIT) {
> >>>>  set_gamma(eq);
> >>>>  set_contrast(eq);
> >>>> diff --git a/libavfilter/vf_eq.h b/libavfilter/vf_eq.h index
> >>>> fa49d46e5c..cd0cd75f08 100644
> >>>> --- a/libavfilter/vf_eq.h
> >>>> +++ b/libavfilter/vf_eq.h
> >>>> @@ -100,6 +100,7 @@ typedef struct EQContext {
> >>>>  enum EvalMode { EVAL_MODE_INIT, EVAL_MODE_FRAME,
> >> EVAL_MODE_NB }
> >>>> eval_mode;  } EQContext;
> >>>>
> >>>> +void ff_eq_init(EQContext *eq);
> >>>>  void ff_eq_init_x86(EQContext *eq);
> >>>>
> >>>>  #endif /* AVFILTER_EQ_H */
> >>>> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
> >>>> index 0112ff603e..de850c016e 100644
> >>>> --- a/tests/checkasm/Makefile
> >>>> +++ b/tests/checkasm/Makefile
> >>>> @@ -36,6 +36,7 @@ CHECKASMOBJS-$(CONFIG_AVCODEC)  +=
> >> $(AVCODECOBJS-yes)
> >>>>  AVFILTEROBJS-$(CONFIG_AFIR_FILTER) += af_afir.o
> >>>>  AVFILTEROBJS-$(CONFIG_BLEND_FILTER) += vf_blend.o
> >>>>  AVFILTEROBJS-$(CONFIG_COLORSPACE_FILTER) += vf_colorspace.o
> >>>> +AVFILTEROBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
> >>>>  AVFILTEROBJS-$(CONFIG_GBLUR_FILTER)  += vf_gblur.o
> >>>>  AVFILTEROBJS-$(CONFIG_HFLIP_FILTER)  += vf_hflip.o
> >>>>  AVFILTEROBJS-$(CONFIG_THRESHOLD_FILTER)  += vf_threshold.o diff
> >>>> --git a/test

Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq

2019-09-26 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel  On Behalf Of James
> Almer
> Sent: Thursday, September 26, 2019 11:20 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq
> 
> On 9/26/2019 11:43 AM, Andreas Rheinhardt wrote:
> > Ting Fu:
> >> Signed-off-by: Ting Fu 
> >> ---
> >>  libavfilter/vf_eq.c   | 13 ---
> >>  libavfilter/vf_eq.h   |  1 +
> >>  tests/checkasm/Makefile   |  1 +
> >>  tests/checkasm/checkasm.c |  3 ++
> >>  tests/checkasm/checkasm.h |  1 +
> >>  tests/checkasm/vf_eq.c| 79
> +++
> >>  tests/fate/checkasm.mak   |  1 +
> >>  7 files changed, 94 insertions(+), 5 deletions(-)  create mode
> >> 100644 tests/checkasm/vf_eq.c
> >>
> >> diff --git a/libavfilter/vf_eq.c b/libavfilter/vf_eq.c index
> >> 2c4c7e4d54..0f9d129255 100644
> >> --- a/libavfilter/vf_eq.c
> >> +++ b/libavfilter/vf_eq.c
> >> @@ -174,12 +174,18 @@ static int set_expr(AVExpr **pexpr, const char
> *expr, const char *option, void *
> >>  return 0;
> >>  }
> >>
> >> +void ff_eq_init(EQContext *eq)
> >> +{
> >> +eq->process = process_c;
> >> +if (ARCH_X86)
> >> +ff_eq_init_x86(eq);
> >> +}
> >> +
> >>  static int initialize(AVFilterContext *ctx)  {
> >>  EQContext *eq = ctx->priv;
> >>  int ret;
> >> -
> >> -eq->process = process_c;
> >> +ff_eq_init(eq);
> >>
> >>  if ((ret = set_expr(>contrast_pexpr, eq->contrast_expr, 
> >> "contrast",
> ctx)) < 0 ||
> >>  (ret = set_expr(>brightness_pexpr,   eq->brightness_expr,
> "brightness",   ctx)) < 0 ||
> >> @@ -191,9 +197,6 @@ static int initialize(AVFilterContext *ctx)
> >>  (ret = set_expr(>gamma_weight_pexpr, eq->gamma_weight_expr,
> "gamma_weight", ctx)) < 0 )
> >>  return ret;
> >>
> >> -if (ARCH_X86)
> >> -ff_eq_init_x86(eq);
> >> -
> >>  if (eq->eval_mode == EVAL_MODE_INIT) {
> >>  set_gamma(eq);
> >>  set_contrast(eq);
> >> diff --git a/libavfilter/vf_eq.h b/libavfilter/vf_eq.h index
> >> fa49d46e5c..cd0cd75f08 100644
> >> --- a/libavfilter/vf_eq.h
> >> +++ b/libavfilter/vf_eq.h
> >> @@ -100,6 +100,7 @@ typedef struct EQContext {
> >>  enum EvalMode { EVAL_MODE_INIT, EVAL_MODE_FRAME,
> EVAL_MODE_NB }
> >> eval_mode;  } EQContext;
> >>
> >> +void ff_eq_init(EQContext *eq);
> >>  void ff_eq_init_x86(EQContext *eq);
> >>
> >>  #endif /* AVFILTER_EQ_H */
> >> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile index
> >> 0112ff603e..de850c016e 100644
> >> --- a/tests/checkasm/Makefile
> >> +++ b/tests/checkasm/Makefile
> >> @@ -36,6 +36,7 @@ CHECKASMOBJS-$(CONFIG_AVCODEC)  +=
> $(AVCODECOBJS-yes)
> >>  AVFILTEROBJS-$(CONFIG_AFIR_FILTER) += af_afir.o
> >>  AVFILTEROBJS-$(CONFIG_BLEND_FILTER) += vf_blend.o
> >>  AVFILTEROBJS-$(CONFIG_COLORSPACE_FILTER) += vf_colorspace.o
> >> +AVFILTEROBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
> >>  AVFILTEROBJS-$(CONFIG_GBLUR_FILTER)  += vf_gblur.o
> >>  AVFILTEROBJS-$(CONFIG_HFLIP_FILTER)  += vf_hflip.o
> >>  AVFILTEROBJS-$(CONFIG_THRESHOLD_FILTER)  += vf_threshold.o diff
> >> --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index
> >> d9a5c7f401..bcbe775510 100644
> >> --- a/tests/checkasm/checkasm.c
> >> +++ b/tests/checkasm/checkasm.c
> >> @@ -165,6 +165,9 @@ static const struct {
> >>  #if CONFIG_COLORSPACE_FILTER
> >>  { "vf_colorspace", checkasm_check_colorspace },
> >>  #endif
> >> +#if CONFIG_EQ_FILTER
> >> +{ "vf_eq", checkasm_check_vf_eq },
> >> +#endif
> >>  #if CONFIG_GBLUR_FILTER
> >>  { "vf_gblur", checkasm_check_vf_gblur },
> >>  #endif
> >> diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
> >> index fdf9eeb75d..0a7f9f25c4 100644
> >> --- a/tests/checkasm/checkasm.h
> >> +++ b/tests/checkasm/checkasm.h
> >> @@ -72,6 +72,7 @@ void checkasm_check_sw_rgb(void);  void
> >> checkasm_check_utvideodsp(void);  void checkasm_check_v210dec(void);
> >> void checkasm_check_v210enc(void);
> >> +void checkasm_check_vf_eq(void);
> >>  void checkasm_check_vf_gblur(void);
> >>  void checkasm_check_vf_hflip(void);
> >>  void checkasm_check_vf_threshold(void);
> >> diff --git a/tests/checkasm/vf_eq.c b/tests/checkasm/vf_eq.c new file
> >> mode 100644 index 00..684718f2cd
> >> --- /dev/null
> >> +++ b/tests/checkasm/vf_eq.c
> >> @@ -0,0 +1,79 @@
> >> +/*
> >> + * This file is part of FFmpeg.
> >> + *
> >> + * FFmpeg is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published
> >> +by
> >> + * the Free Software Foundation; either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * FFmpeg is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public

Re: [FFmpeg-devel] [PATCH V2 2/3] avfilter/x86/vf_eq: Change inline assembly into nasm code

2019-09-22 Thread Fu, Ting

> -Original Message-
> From: ffmpeg-devel  On Behalf Of Ting Fu
> Sent: Wednesday, September 18, 2019 03:06 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V2 2/3] avfilter/x86/vf_eq: Change inline
> assembly into nasm code

Ping?
> 
> Signed-off-by: Ting Fu 
> ---
>  libavfilter/x86/Makefile |  3 +-
>  libavfilter/x86/vf_eq.asm| 82 ++
>  libavfilter/x86/vf_eq.c  | 96 
>  libavfilter/x86/vf_eq_init.c | 55 +
>  4 files changed, 139 insertions(+), 97 deletions(-)  create mode 100644
> libavfilter/x86/vf_eq.asm  delete mode 100644 libavfilter/x86/vf_eq.c  create
> mode 100644 libavfilter/x86/vf_eq_init.c
> 
[...]

Re: [FFmpeg-devel] [PATCH 3/3] avfilter/x86/vf_eq: add SSE2 version

2019-09-18 Thread Fu, Ting



> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> James Almer
> Sent: Tuesday, September 17, 2019 09:56 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 3/3] avfilter/x86/vf_eq: add SSE2 version
> 
> On 9/17/2019 10:39 AM, Ting Fu wrote:
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/x86/vf_eq.asm| 19 +--
> >  libavfilter/x86/vf_eq_init.c | 20 
> >  2 files changed, 37 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavfilter/x86/vf_eq.asm b/libavfilter/x86/vf_eq.asm
> > index bf28691297..d6b51cf6df 100644
> > --- a/libavfilter/x86/vf_eq.asm
> > +++ b/libavfilter/x86/vf_eq.asm
> > @@ -24,14 +24,21 @@
> >
> >  SECTION .text
> >
> > -INIT_MMX mmx
> > +%macro PROCESS_ONE_LINE 1
> >  cglobal process_one_line, 5, 7, 5, src, dst, contrast, brightness, w
> >  movd m3, contrastd
> >  movd m4, brightnessd
> >  movsx r5d, contrastw
> >  movsx r6d, brightnessw
> > +%if mmsize == 8
> >  pshufw m3, m3, 0
> >  pshufw m4, m4, 0
> 
> pshufw isn't mmx, but mmxext (AKA, integer SSE). Hardly a problem since i 
> don't
> think anyone using a Pentium 2 will use this filter, but still it's 
> technically wrong.
> 
> The function should be changed into mmxext to reflect the above.
> 

Thank you for your review. It's true that pshufw belongs to mmxext, and I have 
updated the related code.

> > +%elif mmsize == 16
> > +pshuflw m3, m3, 0
> > +movlhps m3, m3
> > +pshuflw m4, m4, 0
> > +movlhps m4, m4
> > +%endif
> 
> You can use the SPLATW macro instead. It will expand to the correct 
> instructions
> based on what set is enabled.

Yes, the SPLATW macro can fully replace the instructions above. Sorry, I was 
not aware of it before.
I have replaced the code above with SPLATW in PATCH V2.
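
As a scalar illustration of what the pshufw and pshuflw/movlhps sequences 
above achieve (copying one 16-bit word into every lane of the register), a 
plain C model could look like the sketch below. splat_word() is a hypothetical 
helper written for this note, not part of FFmpeg; 4 lanes model an 8-byte MMX 
register and 8 lanes a 16-byte XMM register.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy 16-bit word `index` of src into each of the `lanes` words of dst.
 * The source word is read before any store, so dst may alias src.
 */
void splat_word(uint16_t *dst, const uint16_t *src, size_t lanes, size_t index)
{
    uint16_t w = src[index];

    for (size_t i = 0; i < lanes; i++)
        dst[i] = w;
}
```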

> 
> >
> >  DEFINE_ARGS src, dst, tmp, scalar, w
> >  xor tmpd, tmpd
> > @@ -39,7 +46,7 @@ cglobal process_one_line, 5, 7, 5, src, dst, contrast,
> brightness, w
> >  pxor m1, m1
> >  mov scalard, wd
> >  and scalard, mmsize-1
> > -sar wd, 3
> > +sar wd, %1
> >  cmp wd, 1
> >  jl .loop1
> >
> > @@ -80,3 +87,11 @@ cglobal process_one_line, 5, 7, 5, src, dst,
> > contrast, brightness, w
> >
> >  .end:
> >  RET
> > +
> > +%endmacro
> > +
> > +INIT_MMX mmx
> > +PROCESS_ONE_LINE 3
> > +
> > +INIT_XMM sse2
> > +PROCESS_ONE_LINE 4
> > diff --git a/libavfilter/x86/vf_eq_init.c
> > b/libavfilter/x86/vf_eq_init.c index 63c69078fb..cdd5272220 100644
> > --- a/libavfilter/x86/vf_eq_init.c
> > +++ b/libavfilter/x86/vf_eq_init.c
> > @@ -28,6 +28,8 @@
> >
> >  extern void ff_process_one_line_mmx(const uint8_t *src, uint8_t *dst, int
> contvec,
> >  int brvec, int w);
> > +extern void ff_process_one_line_sse2(const uint8_t *src, uint8_t *dst, int
> contvec,
> > +int brvec, int w);
> >
> >  static void process_mmx(EQParameters *param, uint8_t *dst, int dst_stride,
> >  const uint8_t *src, int src_stride, int w,
> > int h) @@ -44,6 +46,21 @@ static void process_mmx(EQParameters *param,
> uint8_t *dst, int dst_stride,
> >  emms_c();
> >  }
> >
> > +static void process_sse2(EQParameters *param, uint8_t *dst, int dst_stride,
> > +const uint8_t *src, int src_stride, int w,
> > +int h) {
> > +short contrast = (short) (param->contrast * 256 * 16);
> > +short brightness = ((short) (100.0 * param->brightness + 100.0) * 511)
> > +   / 200 - 128 - contrast / 32;
> > +
> > +while (h--) {
> > +ff_process_one_line_sse2(src, dst, contrast, brightness, w);
> > +src += src_stride;
> > +dst += dst_stride;
> > +}
> > +emms_c();
> 
> No need for this since the SSE2 version uses xmm regs.
>

Of course, code that uses only xmm registers does not need emms_c(); my mistake.
Thanks again for your review. I have posted PATCH V2 to ffmpeg-devel.
You are welcome to take a further look.
 
> > +}
> > +
> >  av_cold void ff_eq_init_x86(EQContext *eq)  {
> >  int cpu_flags = av_get_cpu_flags(); @@ -51,5 +68,8 @@ av_cold
> > void ff_eq_init_x86(EQContext *eq)
> >  if (cpu_flags & AV_CPU_FLAG_MMX) {
> >  eq->process = process_mmx;
> >  }
> > +if (cpu_flags & AV_CPU_FLAG_SSE2) {
> > +eq->process = process_sse2;
> > +}
> >  }
> >
> >
> 
