Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Pedro Arthur
2018-07-26 23:19 GMT-03:00 Ronald S. Bultje :
> Hi,
>
> On Thu, Jul 26, 2018 at 10:04 PM, Pedro Arthur  wrote:
>
>> If you compare NN weights with quantization tables they are pretty
>> similar
>
>
> https://chromium.googlesource.com/webm/libvpx/+/3b9c19aaa7b8830a896c5f578a3ce6c6a7953947%5E%21/#F0
>
> So, that one tiny single function is how VP9/AV1 quant tables are generated.
>
> Or, the HEVC/H264 ones, they are even simpler: exp2(qp/6).
>
> Are NN weights a single, one-line (10-character) expression? Please
> elaborate. Why isn't that 10-character function documented anywhere?
I think you missed the point. I wrote "can be obtained from a training
process over a dataset so it achieves better results
(quality/compression)".
Taking VP9 as an example: sure, the coefficients are obtained from
'poly3', but the real data are the polynomial coefficients. Does anyone
ask where these polynomial coefficients came from, about their
reproducibility, etc.? Your comparison does not seem fair to me.

>
> Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 3/3] lavfi/motion_estimation: use pixelutils API for sad.

2018-07-26 Thread myp...@gmail.com
On Mon, Jul 23, 2018 at 2:33 AM Marton Balint  wrote:
>
>
>
> On Tue, 17 Jul 2018, myp...@gmail.com wrote:
>
> > On Sun, Jul 15, 2018 at 1:03 AM Michael Niedermayer
> >  wrote:
> >>
> >> On Sat, Jul 14, 2018 at 12:04:46PM +0200, Marton Balint wrote:
> >> >
> >> >
> >> > On Sat, 14 Jul 2018, Michael Niedermayer wrote:
> >> >
> >> > >On Fri, Jul 13, 2018 at 10:51:00AM +0200, Marton Balint wrote:
> >> > >>
> >> > >>
> >> > >>On Thu, 12 Jul 2018, myp...@gmail.com wrote:
> >> > >>
> >> > >>>On Thu, Jul 12, 2018 at 12:43 AM Marton Balint  wrote:
> >> > >>>>
> >> > >>>>
> >> > >>>>
> >> > >>>>On Wed, 11 Jul 2018, Jun Zhao wrote:
> >> > >>>>
> >> > >>>>>use pixelutils API for sad in motion estimation.
> >> > >>>>
> >> > >>>>Does it make sense to improve this code? I thought a superior and
> >> > >>>>faster approach was a result of a 2017 GSOC task:
> >> > >>>>
> >> > >>>>https://docs.google.com/document/d/1Hyh_rxP1KGsVkg7i7yU8Bcv92z0LIL4r-axpoKfvMFk/edit
> >> > >>>>
> >> > >>>>Maybe that code should be merged back, and any further optimization
> >> > >>>>should be done based on that code, no?
> >> > >>>>
> >> > >>>>Thanks,
> >> > >>>>Marton
> >> > >>>>
> >> > >>>Hi, Marton:
> >> > >>>
> >> > >>>Yes, I am now trying to improve minterpolate. After profiling
> >> > >>>this command with perf:
> >> > >>>
> >> > >>>./ffmpeg -i a.ts -filter_complex
> >> > >>>"minterpolate=mi_mode=mci:mc_mode=aobmc:vsbmc=1" -f null /dev/null
> >> > >>>
> >> > >>>I found that the hotspots are:
> >> > >>>- get_sbad_ob
> >> > >>>- get_sbad
> >> > >>>- get_sad_ob
> >> > >>>- bilateral_obmc
> >> > >>>- set_frame_data
> >> > >>>
> >> > >>>So my plan is to first try SSE2/AVX2 Scatter/Gather and an
> >> > >>>optimized sad function (using the pixelutils API) in
> >> > >>>get_sbad_ob / get_sbad / get_sad_ob. For the set_frame_data
> >> > >>>case, Scatter/Gather SIMD instructions may be needed.
> >> > >>
> >> > >>That is great, all I am saying is that we should avoid diverging the
> >> > >>two branches (FFmpeg branch and GSOC 2017 branch), and try to merge
> >> > >>back GSOC2017 if it can be done with a reasonable amount of work
> >> > >>before optimizing code; otherwise the GSOC2017 branch will rot and we
> >> > >>will lose the result of the GSOC task.
> >> > >>
> >> > >>>
> >> > >>>But if others have already done improvement work here, I think
> >> > >>>building on the pre-existing work is the better way.
> >> > >>
> >> > >>Michael was the mentor, maybe he can chip in on what should be done
> >> > >>here.
> >> > >
> >> > >talk with the author/student who wrote the code, not me :)
> >> >
> >> > Well, he's not active here,
> >>
> >> yes, but last I heard from him, he was interested in continuing this
> >> project. I think I've not heard much from him after that, but I now see
> >> that there is a small commit in his repo from 2018, so he is not
> >> completely inactive.
> >> I think you should talk with him
> >>
> >>
> >> > and the question is if his work is ready for
> >> > mainline inclusion or not, and if he has done enough valuable work during
> >> > GSOC that its worth working on mainlining it.
> >>
> >> He certainly did valuable work. Looking now at the ML, it seems more or
> >> less the last thing on the ML was the RFC/Discussion thread about libmotion.
> >> In that thread everyone wanted to dictate the design, and all of it was
> >> contradictory.
> >> If you want to work on unifying this entangled bikeshed ball of conflicting
> >> opinions, that surely is very welcome. What is important is that it ends in
> >> something that is practical and high quality.
> >> Personally I think the author should be given more authority in the design.
> >> But again, please talk with the author of this code.
> >> I don't remember everything about this in that much detail ...
> >>
> >> also ive added him to the CC
> >>
> >> Thanks
> >>
> >>
> > The minterpolate/libmotion author has not given any feedback or
> > suggestions so far, so I will update patch 1/2 (just adding SSE2/AVX2
> > sad_32x32) with some perf data and hold off on patch 3 for minterpolate.
> > Any other comments?
>
> I checked the "libmotion" series, and it seems the commits are in a
> debug/development state and are not clean, so some heavy
> refactoring is needed before applying them anyway.
>
> Do what you prefer; snow codec based motion compensation is an additional
> algorithm to the existing code anyway, as far as I can see.
>
From my point of view, I prefer to improve the current minterpolate filter
first; then we can try to refactor the "libmotion" series. Done Is Better Than
Perfect. Any other comments or suggestions?


Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Ronald S. Bultje
Hi,

On Thu, Jul 26, 2018 at 10:04 PM, Pedro Arthur  wrote:

> If you compare NN weights with quantization tables they are pretty
> similar


https://chromium.googlesource.com/webm/libvpx/+/3b9c19aaa7b8830a896c5f578a3ce6c6a7953947%5E%21/#F0

So, that one tiny single function is how VP9/AV1 quant tables are generated.

Or, the HEVC/H264 ones, they are even simpler: exp2(qp/6).

Are NN weights a single, one-line (10-character) expression? Please
elaborate. Why isn't that 10-character function documented anywhere?
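
To make the exp2(qp/6) remark concrete, here is a minimal Python sketch. It
assumes the quantizer step is simply proportional to 2^(qp/6) and leaves out
any codec-specific base constants; the function name qstep is hypothetical and
not part of any codec API.

```python
import math

def qstep(qp: int) -> float:
    """Relative quantizer step for a QP value (scale constant omitted)."""
    return 2.0 ** (qp / 6)

if __name__ == "__main__":
    # Under this assumption the step doubles every 6 QP values.
    for qp in (0, 6, 12, 18):
        print(qp, qstep(qp))
```

The whole "table" is regenerated from one expression, which is the contrast
being drawn with trained weights.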

Ronald


Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Pedro Arthur
Hi,

I'm surprised by this patch; no concern was raised during the
patch review process.

2018-07-26 16:26 GMT-03:00 Rostislav Pehlivanov :
> As discussed recently, the vf_sr filter and the DNN framework have an
> issue: unreproducible weights and questionable license, as well as
> overall unfitting coding style to the rest of the project.
I'm not aware of these discussions; could you provide a
reference? I also don't understand why the coding style is unfitting
(again, no concern was raised).

>
> The vf_sr filter in particular has weights embedded which weight the
> libavfilter binary by a bit and cannot currently be reproduced.
> There's an overall consensus that NN filters should accept external
> weights only, as the nnedi filter currently does.
>
> So, temporarily remove both until the coding style issues have been
> fixed with the framework and the filter has been modified to accept
> external weights.
What are these issues so we can fix them?

>
> Also, there's a discussion by the Debian folks as to whether to treat
> pretrained NNs as non-free[0], hence it's not just our project that's
> affected by the questionable license of distributing pretrained NN
> weights.
>
> Due to the weight of the patch (more than 1mb!) I've uploaded it to
> https://0x0.st/sVEH.patch if anyone wants to test it. The change stat
> is printed below.
>
> [0]: https://lwn.net/Articles/760142/
I took the time to read the whole discussion, and in my opinion it is
flawed, except for the storage requirement, for which Sergey has already
worked on a patch to reduce the stored data.


I think that before any discussion it should be clear what the FFmpeg
policy on adding data is, and what is considered data, and that policy
should be consistent.

I'll try to address the topics in the above discussion.

First, what is data? Is it expected that all stored data should be
easily reproducible?
I guess not; what is the point of storing data that is easily reproducible?
All of humanity is built on previously stored knowledge, namely
data. Do we require that knowledge to be reproduced each time someone is
going to use it, that is, to prove everything one is using?
The answer is no. The whole point of storing data is that someone once
worked hard to prove it works, and from then on we just use it and trust
that it works. That does not mean it is impossible to prove everything.

I think the above fits perfectly with NN weights as data.

The next point is reproducibility. One should be reasonable: it is
hard to reproduce the stored NN data bit by bit, but it is totally doable
to achieve similar results by following the same training methodology.


Then there is the question of what is open source; once again one should
be reasonable.
The NN model is publicly available, everything is documented, and the math
machinery is also widely available and documented.
There is also a repository containing everything one needs to train
the NN and achieve similar results; the training data is public too.
The training software is open source, and the user could also use any
machine learning framework of their choice to perform the training,
since the model is documented. There is nothing locking one to a
specific software or hardware.
I can't see what is not open.


Do we impose all the requirements imposed on NN weights on all other
data stored in FFmpeg?
I guess not; once more one should be consistent.


If you compare NN weights with quantization tables they are pretty
similar: both can be obtained from a training process over a dataset
so it achieves better results (quality/compression). Are quantization
tables evil? I don't think so.
It seems people are afraid of NNs just because we give them a fancy name,
while they are just tables of data like the ones we have always used in
our code.

>
> Signed-off-by: Rostislav Pehlivanov 
>
> Rostislav Pehlivanov (1):
>   libavfilter: temporarily remove DNN framework and vf_sr filter
>
>  Changelog                        |     1 -
>  configure                        |     8 -
>  libavfilter/Makefile             |     3 -
>  libavfilter/allfilters.c         |     1 -
>  libavfilter/dnn_backend_native.c |   495 --
>  libavfilter/dnn_backend_native.h |    40 -
>  libavfilter/dnn_backend_tf.c     |   325 -
>  libavfilter/dnn_backend_tf.h     |    40 -
>  libavfilter/dnn_espcn.h          | 12637 -
>  libavfilter/dnn_interface.c      |    60 -
>  libavfilter/dnn_interface.h      |    63 -
>  libavfilter/dnn_srcnn.h          |  4957 ---
>  libavfilter/vf_sr.c              |   354 -
>  13 files changed, 18984 deletions(-)
>  delete mode 100644 libavfilter/dnn_backend_native.c
>  delete mode 100644 libavfilter/dnn_backend_native.h
>  delete mode 100644 libavfilter/dnn_backend_tf.c
>  delete mode 100644 libavfilter/dnn_backend_tf.h
>  delete mode 100644 libavfilter/dnn_espcn.h
>  delete mode 100644 libavfilter/dnn_interface.c
>  delete mode 100644 libavfilter/dnn_interface.h
>  delete mode 100644 libavfilter/dnn_srcnn.h
>  delete mode 100644 

Re: [FFmpeg-devel] [PATCH] avformat/movenc: implicitly enable negative CTS offsets for ismv

2018-07-26 Thread Michael Niedermayer
On Thu, Jul 26, 2018 at 02:51:38AM +0300, Jan Ekström wrote:
> ISMV lacks any sort of edit list support, as well as tfxd is
> effectively the PTS of the fragment for most intents and purposes.
> 
> Thus, if b-frames are requested without negative CTS offsets you
> end up with N frames' worth of delay (tfxd PTS plus the CTS offset
> of the first sample). Negative CTS offsets enable the first sample
> to have CTS=DTS, and thus a/v desync due to b-frame reorder delay
> is avoided.
> ---
>  doc/muxers.texi   | 2 ++
>  libavformat/movenc.c  | 2 +-
>  tests/ref/fate/movenc | 4 ++--
>  3 files changed, 5 insertions(+), 3 deletions(-)
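
The delay described in the quoted commit message can be sketched in a few
lines of Python. This is only an illustration under simplifying assumptions:
one tick per frame, DTS equal to the decode index, and a four-frame group in
decode order I P B B. The helper name cts_offsets is hypothetical and does not
correspond to anything in movenc.c.

```python
def cts_offsets(pts_decode_order, allow_negative):
    """Return (dts, offsets) for samples listed in decode order.

    DTS is assumed to be 0, 1, 2, ... (one tick per frame).
    offset = CTS - DTS for each sample.
    """
    dts = list(range(len(pts_decode_order)))
    offsets = [p - d for p, d in zip(pts_decode_order, dts)]
    if not allow_negative:
        # Without negative offsets, all presentation times are pushed back
        # until the smallest offset reaches zero.
        delay = max(0, -min(offsets))
        offsets = [o + delay for o in offsets]
    return dts, offsets

if __name__ == "__main__":
    # Decode order I P B B, display order I B B P.
    pts = [0, 3, 1, 2]
    print(cts_offsets(pts, allow_negative=False))
    print(cts_offsets(pts, allow_negative=True))
```

With negative offsets allowed, the first sample keeps offset 0 (CTS=DTS);
without them, the first sample carries the reorder delay as its offset.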

breaks fate

TEST    lavf-ismv
--- ./tests/ref/lavf/ismv   2018-07-20 13:20:28.137581113 +0200
+++ tests/data/fate/lavf-ismv   2018-07-27 00:29:48.709348455 +0200
@@ -1,9 +1,9 @@
-a9ccbb4cd1436d222ef4425567b4e03d *./tests/data/lavf/lavf.ismv
+96053075a3f60d271131fe2d0765c267 *./tests/data/lavf/lavf.ismv
 312542 ./tests/data/lavf/lavf.ismv
 ./tests/data/lavf/lavf.ismv CRC=0x9d9a638a
-440d85f9fd5b9f63c2676638782b5c15 *./tests/data/lavf/lavf.ismv
+7022701b4c693bc4ffe1e9f96dd82a02 *./tests/data/lavf/lavf.ismv
 321448 ./tests/data/lavf/lavf.ismv
 ./tests/data/lavf/lavf.ismv CRC=0xe8130120
-a9ccbb4cd1436d222ef4425567b4e03d *./tests/data/lavf/lavf.ismv
+96053075a3f60d271131fe2d0765c267 *./tests/data/lavf/lavf.ismv
 312542 ./tests/data/lavf/lavf.ismv
 ./tests/data/lavf/lavf.ismv CRC=0x9d9a638a
Test lavf-ismv failed. Look at tests/data/fate/lavf-ismv.err for details.
make: *** [fate-lavf-ismv] Error 1

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.




[FFmpeg-devel] [PATCH] lavfi/nlmeans: fixup aarch64 assembly with clang

2018-07-26 Thread Jan Ekström
Clang is more strict about some things.
---
 libavfilter/aarch64/vf_nlmeans_neon.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavfilter/aarch64/vf_nlmeans_neon.S b/libavfilter/aarch64/vf_nlmeans_neon.S
index 6308a428db..ac16157bbd 100644
--- a/libavfilter/aarch64/vf_nlmeans_neon.S
+++ b/libavfilter/aarch64/vf_nlmeans_neon.S
@@ -22,7 +22,7 @@
 
 // acc_sum_store(ABCD) = {X+A, X+A+B, X+A+B+C, X+A+B+C+D}
 .macro acc_sum_store x, xb
-        dup             v24.4S, v24.4S[3]               // ...X -> XXXX
+        dup             v24.4s, v24.s[3]                // ...X -> XXXX
         ext             v25.16B, v26.16B, \xb, #12      // ext(0000,ABCD,12)=0ABC
         add             v24.4S, v24.4S, \x              // XXXX+ABCD={X+A,X+B,X+C,X+D}
         add             v24.4S, v24.4S, v25.4S          // {X+A,X+B+A,X+C+B,X+D+C}   (+0ABC)
@@ -37,7 +37,7 @@ function ff_compute_safe_ssd_integral_image_neon, export=1
         movi            v26.4S, #0                      // used as zero for the "rotations" in acc_sum_store
         sub             x3, x3, w6, UXTW                // s1 padding (s1_linesize - w)
         sub             x5, x5, w6, UXTW                // s2 padding (s2_linesize - w)
-        sub             x9, x0, x1, UXTW #2             // dst_top
+        sub             x9, x0, w1, UXTW #2             // dst_top
         sub             x1, x1, w6, UXTW                // dst padding (dst_linesize_32 - w)
         lsl             x1, x1, #2                      // dst padding expressed in bytes
 1:      mov             w10, w6                         // width copy for each line
-- 
2.17.1
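
For readers unfamiliar with the acc_sum_store macro touched above, the
following Python sketch models what its header comment describes: a running
sum over four lanes, seeded with the last lane of the previous result. Only
the first shift-and-add step is visible in the hunk; the two further steps
(adding 00AB and 000A) are an assumption about how the macro completes the
sum, and the helper names are hypothetical.

```python
def shift_in_zero(v):
    """Model of ext(zero, v, #12) on four 32-bit lanes: ABCD -> 0ABC."""
    return [0] + v[:3]

def acc_sum_store(prev, x):
    """prev: 4-lane result of the previous group; x: new lanes ABCD."""
    carry = prev[3]                   # dup: ...X -> XXXX
    acc = [carry + a for a in x]      # XXXX + ABCD
    shifted = x
    for _ in range(3):                # add 0ABC, then 00AB, then 000A
        shifted = shift_in_zero(shifted)
        acc = [a + s for a, s in zip(acc, shifted)]
    return acc

if __name__ == "__main__":
    print(acc_sum_store([0, 0, 0, 10], [1, 2, 3, 4]))
```

The result matches {X+A, X+A+B, X+A+B+C, X+A+B+C+D} for X=10 and ABCD=1,2,3,4.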



Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Carl Eugen Hoyos
2018-07-26 21:26 GMT+02:00, Rostislav Pehlivanov :

> There's an overall consensus that NN filters should accept
> external weights only

Do you mean an overall consensus on irc?
I ask because the patch in question was sent several times for
review, and I don't remember a comment concerning internal
weights. When the issue was brought up on the mailing list, at
least one developer defended the internal weights iirc.
(No opinion here.)

Carl Eugen


Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Paul B Mahol
On 7/26/18, Thilo Borgmann  wrote:
> Hi,
>
> Am 26.07.18 um 21:26 schrieb Rostislav Pehlivanov:
>> As discussed recently, the vf_sr filter and the DNN framework have an
>> issue: unreproducible weights and questionable license, as well as
>> overall unfitting coding style to the rest of the project.
>>
>> The vf_sr filter in particular has weights embedded which weight the
>> libavfilter binary by a bit and cannot currently be reproduced.
>> There's an overall consensus that NN filters should accept external
>> weights only, as the nnedi filter currently does.
>>
>> So, temporarily remove both until the coding style issues have been
>> fixed with the framework and the filter has been modified to accept
>> external weights.
>>
>> Also, there's a discussion by the Debian folks as to whether to treat
>> pretrained NNs as non-free[0], hence it's not just our project that's
>> affected by the questionable license of distributing pretrained NN
>> weights.
>
> I personally don't have a good feeling about pre-trained NNs in the codebase
> either. However, I do not care much about which solution we take, but Mina's
> GSoC project also depends on the NN code that comes with this, and therefore
> I'd encourage everyone to make up their mind and find a suitable solution
> sometime soonish.
>
> Maybe for the time-being, we might only accept such code reading in
> externally provided NNs and/or the ability to train using FFmpeg itself. (Or
> ask one of our kind users with compute power to generate some for us)

IIRC the mentioned filter already supports external files. It just has
an internal one too.


Re: [FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Thilo Borgmann
Hi,

Am 26.07.18 um 21:26 schrieb Rostislav Pehlivanov:
> As discussed recently, the vf_sr filter and the DNN framework have an
> issue: unreproducible weights and questionable license, as well as
> overall unfitting coding style to the rest of the project.
> 
> The vf_sr filter in particular has weights embedded which weight the
> libavfilter binary by a bit and cannot currently be reproduced.
> There's an overall consensus that NN filters should accept external
> weights only, as the nnedi filter currently does.
> 
> So, temporarily remove both until the coding style issues have been
> fixed with the framework and the filter has been modified to accept
> external weights.
> 
> Also, there's a discussion by the Debian folks as to whether to treat
> pretrained NNs as non-free[0], hence it's not just our project that's
> affected by the questionable license of distributing pretrained NN
> weights.

I personally don't have a good feeling about pre-trained NNs in the codebase
either. However, I do not care much about which solution we take, but Mina's GSoC
project also depends on the NN code that comes with this, and therefore I'd
encourage everyone to make up their mind and find a suitable solution sometime
soonish.

Maybe for the time-being, we might only accept such code reading in externally 
provided NNs and/or the ability to train using FFmpeg itself. (Or ask one of 
our kind users with compute power to generate some for us)


> Due to the weight of the patch (more than 1mb!) I've uploaded it to
> https://0x0.st/sVEH.patch if anyone wants to test it. The change stat
> is printed below.
> 
> [0]: https://lwn.net/Articles/760142/
> 
> Signed-off-by: Rostislav Pehlivanov 
> 
> Rostislav Pehlivanov (1):
>   libavfilter: temporarily remove DNN framework and vf_sr filter
> 
>  Changelog                        |     1 -
>  configure                        |     8 -
>  libavfilter/Makefile             |     3 -
>  libavfilter/allfilters.c         |     1 -
>  libavfilter/dnn_backend_native.c |   495 --
>  libavfilter/dnn_backend_native.h |    40 -
>  libavfilter/dnn_backend_tf.c     |   325 -
>  libavfilter/dnn_backend_tf.h     |    40 -
>  libavfilter/dnn_espcn.h          | 12637 -
>  libavfilter/dnn_interface.c      |    60 -
>  libavfilter/dnn_interface.h      |    63 -
>  libavfilter/dnn_srcnn.h          |  4957 ---
>  libavfilter/vf_sr.c              |   354 -
>  13 files changed, 18984 deletions(-)
>  delete mode 100644 libavfilter/dnn_backend_native.c
>  delete mode 100644 libavfilter/dnn_backend_native.h
>  delete mode 100644 libavfilter/dnn_backend_tf.c
>  delete mode 100644 libavfilter/dnn_backend_tf.h
>  delete mode 100644 libavfilter/dnn_espcn.h
>  delete mode 100644 libavfilter/dnn_interface.c
>  delete mode 100644 libavfilter/dnn_interface.h
>  delete mode 100644 libavfilter/dnn_srcnn.h
>  delete mode 100644 libavfilter/vf_sr.c
> 

-Thilo


[FFmpeg-devel] [PATCH] libavfilter: temporarily remove DNN framework and vf_sr filter

2018-07-26 Thread Rostislav Pehlivanov
As discussed recently, the vf_sr filter and the DNN framework have an
issue: unreproducible weights and questionable license, as well as
overall unfitting coding style to the rest of the project.

The vf_sr filter in particular has weights embedded which weight the
libavfilter binary by a bit and cannot currently be reproduced.
There's an overall consensus that NN filters should accept external
weights only, as the nnedi filter currently does.

So, temporarily remove both until the coding style issues have been
fixed with the framework and the filter has been modified to accept
external weights.

Also, there's a discussion by the Debian folks as to whether to treat
pretrained NNs as non-free[0], hence it's not just our project that's
affected by the questionable license of distributing pretrained NN
weights.

Due to the weight of the patch (more than 1mb!) I've uploaded it to
https://0x0.st/sVEH.patch if anyone wants to test it. The change stat
is printed below.

[0]: https://lwn.net/Articles/760142/

Signed-off-by: Rostislav Pehlivanov 

Rostislav Pehlivanov (1):
  libavfilter: temporarily remove DNN framework and vf_sr filter

 Changelog                        |     1 -
 configure                        |     8 -
 libavfilter/Makefile             |     3 -
 libavfilter/allfilters.c         |     1 -
 libavfilter/dnn_backend_native.c |   495 --
 libavfilter/dnn_backend_native.h |    40 -
 libavfilter/dnn_backend_tf.c     |   325 -
 libavfilter/dnn_backend_tf.h     |    40 -
 libavfilter/dnn_espcn.h          | 12637 -
 libavfilter/dnn_interface.c      |    60 -
 libavfilter/dnn_interface.h      |    63 -
 libavfilter/dnn_srcnn.h          |  4957 ---
 libavfilter/vf_sr.c              |   354 -
 13 files changed, 18984 deletions(-)
 delete mode 100644 libavfilter/dnn_backend_native.c
 delete mode 100644 libavfilter/dnn_backend_native.h
 delete mode 100644 libavfilter/dnn_backend_tf.c
 delete mode 100644 libavfilter/dnn_backend_tf.h
 delete mode 100644 libavfilter/dnn_espcn.h
 delete mode 100644 libavfilter/dnn_interface.c
 delete mode 100644 libavfilter/dnn_interface.h
 delete mode 100644 libavfilter/dnn_srcnn.h
 delete mode 100644 libavfilter/vf_sr.c

-- 
2.18.0



Re: [FFmpeg-devel] [PATCH 1/2] lavc/amfenc: moving amf common code (library and context) to lavu/hwcontext_amf from amfenc to be reused in other amf components

2018-07-26 Thread Alexander Kravchenko
Hello.
This is a reminder.
Could you please review the patch? If it is OK, could you apply it?
It was published 2 weeks ago and is required for further updates.

Thanks,
Alexander


Re: [FFmpeg-devel] [PATCH 1/3] diracdec: add 10-bit Haar SIMD functions

2018-07-26 Thread Rostislav Pehlivanov
On 26 July 2018 at 12:28, James Darnley  wrote:

> +
> +%macro HAAR_HORIZONTAL 0
> +
> +cglobal horizontal_compose_haar_10bit, 3, 6+ARCH_X86_64, 4, b, temp_, w,
> x, b2
> +DECLARE_REG_TMP 2,5
> +%if ARCH_X86_64
> +%define tail r6d
> +%else
> +%define tail dword wm
> +%endif
> +
>

You can remove this whole bit, the init function only gets called if
ARCH_X86_64 is true.


Re: [FFmpeg-devel] [PATCH 1/3] diracdec: add 10-bit Haar SIMD functions

2018-07-26 Thread Rostislav Pehlivanov
On 26 July 2018 at 12:28, James Darnley  wrote:

> Speed of ffmpeg when decoding a 720p yuv422p10 file encoded with the
> relevant transform.
> C:    119fps
> SSE2: 204fps
> AVX:  206fps
> AVX2: 221fps
>
> timer measurements, haar horizontal compose:
> sse2: 3.68x faster (45143 vs. 12279 decicycles) compared with C
> avx:  3.68x faster (45143 vs. 12275 decicycles) compared with C
> avx2: 5.16x faster (45143 vs.  8742 decicycles) compared with C
> haar vertical compose:
> sse2: 1.64x faster (31792 vs. 19377 decicycles) compared with C
> avx:  1.58x faster (31792 vs. 20090 decicycles) compared with C
> avx2: 1.66x faster (31792 vs. 19157 decicycles) compared with C
> ---
>  libavcodec/dirac_dwt.c                |   7 +-
>  libavcodec/dirac_dwt.h                |   1 +
>  libavcodec/x86/Makefile               |   6 +-
>  libavcodec/x86/dirac_dwt_10bit.asm    | 160 ++
>  libavcodec/x86/dirac_dwt_init_10bit.c |  76
>  5 files changed, 247 insertions(+), 3 deletions(-)
>  create mode 100644 libavcodec/x86/dirac_dwt_10bit.asm
>  create mode 100644 libavcodec/x86/dirac_dwt_init_10bit.c
>
> diff --git a/libavcodec/dirac_dwt.c b/libavcodec/dirac_dwt.c
> index cc08f8865a..86bee5bb9b 100644
> --- a/libavcodec/dirac_dwt.c
> +++ b/libavcodec/dirac_dwt.c
> @@ -59,8 +59,13 @@ int ff_spatial_idwt_init(DWTContext *d, DWTPlane *p,
> enum dwt_type type,
>  return AVERROR_INVALIDDATA;
>  }
>
> -if (ARCH_X86 && bit_depth == 8)
> +#if ARCH_X86
> +if (bit_depth == 8)
>  ff_spatial_idwt_init_x86(d, type);
> +else if (bit_depth == 10)
> +ff_spatial_idwt_init_10bit_x86(d, type);
> +#endif
> +
>  return 0;
>  }
>
> diff --git a/libavcodec/dirac_dwt.h b/libavcodec/dirac_dwt.h
> index 994dc21d70..1ad7b9a821 100644
> --- a/libavcodec/dirac_dwt.h
> +++ b/libavcodec/dirac_dwt.h
> @@ -88,6 +88,7 @@ enum dwt_type {
>  int ff_spatial_idwt_init(DWTContext *d, DWTPlane *p, enum dwt_type type,
>   int decomposition_count, int bit_depth);
>  void ff_spatial_idwt_init_x86(DWTContext *d, enum dwt_type type);
> +void ff_spatial_idwt_init_10bit_x86(DWTContext *d, enum dwt_type type);
>
>  void ff_spatial_idwt_slice2(DWTContext *d, int y);
>
> diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
> index 2350c8bbee..590d83c167 100644
> --- a/libavcodec/x86/Makefile
> +++ b/libavcodec/x86/Makefile
> @@ -7,7 +7,8 @@ OBJS-$(CONFIG_BLOCKDSP)+=
> x86/blockdsp_init.o
>  OBJS-$(CONFIG_BSWAPDSP)+= x86/bswapdsp_init.o
>  OBJS-$(CONFIG_DCT) += x86/dct_init.o
>  OBJS-$(CONFIG_DIRAC_DECODER)   += x86/diracdsp_init.o   \
> -  x86/dirac_dwt_init.o
> +  x86/dirac_dwt_init.o \
> +  x86/dirac_dwt_init_10bit.o
>  OBJS-$(CONFIG_FDCTDSP) += x86/fdctdsp_init.o
>  OBJS-$(CONFIG_FFT) += x86/fft_init.o
>  OBJS-$(CONFIG_FLACDSP) += x86/flacdsp_init.o
> @@ -153,7 +154,8 @@ X86ASM-OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp.o
>  X86ASM-OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsidct.o
>  X86ASM-OBJS-$(CONFIG_DCA_DECODER)  += x86/dcadsp.o x86/synth_filter.o
>  X86ASM-OBJS-$(CONFIG_DIRAC_DECODER)+= x86/diracdsp.o\
> -  x86/dirac_dwt.o
> +  x86/dirac_dwt.o \
> +  x86/dirac_dwt_10bit.o
>  X86ASM-OBJS-$(CONFIG_DNXHD_ENCODER)+= x86/dnxhdenc.o
>  X86ASM-OBJS-$(CONFIG_EXR_DECODER)  += x86/exrdsp.o
>  X86ASM-OBJS-$(CONFIG_FLAC_DECODER) += x86/flacdsp.o
> diff --git a/libavcodec/x86/dirac_dwt_10bit.asm
> b/libavcodec/x86/dirac_dwt_10bit.asm
> new file mode 100644
> index 00..baea91329e
> --- /dev/null
> +++ b/libavcodec/x86/dirac_dwt_10bit.asm
> @@ -0,0 +1,160 @@
> +;******************************************************************************
> +;* x86 optimized discrete 10-bit wavelet trasnform
> +;* Copyright (c) 2018 James Darnley
> +;*
> +;* This file is part of FFmpeg.
> +;*
> +;* FFmpeg is free software; you can redistribute it and/or
> +;* modify it under the terms of the GNU Lesser General Public
> +;* License as published by the Free Software Foundation; either
> +;* version 2.1 of the License, or (at your option) any later version.
> +;*
> +;* FFmpeg is distributed in the hope that it will be useful,
> +;* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +;* Lesser General Public License for more details.
> +;*
> +;* You should have received a copy of the GNU Lesser General Public
> +;* License along with FFmpeg; if not, write to the Free Software
> +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> 

Re: [FFmpeg-devel] [PATCH] Support for Ambisonics and OpusProjection* API.

2018-07-26 Thread Vittorio Giovara
On Thu, Jul 26, 2018 at 4:15 PM, Rostislav Pehlivanov 
wrote:

> Hey,
>
> As of now, the ambisonics API is enabled by default in libopus. We still
> don't have a way to signal ambisonics yet.
> We still have plenty of bits left in libavutil/channel_layout.h to signal
> many orders of ambisonics but some people have had opinions against
> extending that API. We could instead extend AVMatrixEncoding but I don't
> think that's entirely appropriate.
> What opinions do people have on this?
>

I had been working on a new API that would encompass ambisonic ordering
(see
https://github.com/kodabb/libav/commit/98d9b0a7b28525b29e40ae4c564e51e7c94449eb).
The downside is that it requires updating the whole channel layout API (see
https://github.com/kodabb/libav/commit/c023b553e6ad6da5af6d0d4ff067ff844b2fcfac).
I got it mostly working but ran into issues during backward compatibility,
and didn't have time to debug and fix it.

If anyone wants to finish the set, backport it, and add the missing lswr
part it would be easy work. I'm available to help in the process just to
get this completed.
The full branch is available at https://github.com/kodabb/libav/commits/chl
(I hope this will be a mature discussion even though the patches belong to
another tree).
-- 
Vittorio


Re: [FFmpeg-devel] [PATCH] Support for Ambisonics and OpusProjection* API.

2018-07-26 Thread Rostislav Pehlivanov
Hey,

As of now, the ambisonics API is enabled by default in libopus. We still
don't have a way to signal ambisonics yet.
We still have plenty of bits left in libavutil/channel_layout.h to signal
many orders of ambisonics but some people have had opinions against
extending that API. We could instead extend AVMatrixEncoding but I don't
think that's entirely appropriate.
What opinions do people have on this?


[FFmpeg-devel] [PATCH 0/1] libavformat/dashenc: Fix relative URI of HLS master playlist

2018-07-26 Thread Antonio Morell
When using the DASH muxer to produce a segmented media stream, enabling the
"hls_playlist" setting also yields HLS-compatible master and playlist manifest
files.
However, the relative URI of the master playlist is not formed as expected,
since an extra slash precedes the file name, i.e., 
::///PATH//master.m3u8 
is generated instead of the expected 
::///PATH/master.m3u8.
This patch just removes the extra slash preceding the file name.
As can be seen at line 341, media playlists are already produced without
the extra preceding slash.
The resulting URI of the master.m3u8 file is now consistent with the other
assets yielded by the muxer.
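
As a rough illustration of the problem, the Python sketch below mimics the two
format strings. It assumes, as the description above implies, that the
directory name already ends with a slash; the function name
master_playlist_path and the example directory "out/" are made up for this
sketch and do not exist in dashenc.c.

```python
def master_playlist_path(dirname: str, with_extra_slash: bool) -> str:
    """Build the master playlist path the way the old or new format does."""
    if not dirname:
        return "master.m3u8"
    fmt = "%s/master.m3u8" if with_extra_slash else "%smaster.m3u8"
    return fmt % dirname

if __name__ == "__main__":
    print(master_playlist_path("out/", with_extra_slash=True))   # old format
    print(master_playlist_path("out/", with_extra_slash=False))  # patched format
```

The old format string produces a doubled slash whenever the directory part
carries its own trailing slash.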

Antonio Morell (1):
  libavformat/dashenc: Fix relative URI of HLS master playlist

 libavformat/dashenc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.15.2 (Apple Git-101.1)



[FFmpeg-devel] [PATCH 1/1] libavformat/dashenc: Fix relative URI of HLS master playlist

2018-07-26 Thread Antonio Morell
---
 libavformat/dashenc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c
index a9b8b1d4f6..ae57fd5493 100644
--- a/libavformat/dashenc.c
+++ b/libavformat/dashenc.c
@@ -868,7 +868,7 @@ static int write_manifest(AVFormatContext *s, int final)
 int max_audio_bitrate = 0;
 
 if (*c->dirname)
-snprintf(filename_hls, sizeof(filename_hls), "%s/master.m3u8", c->dirname);
+snprintf(filename_hls, sizeof(filename_hls), "%smaster.m3u8", c->dirname);
 else
 snprintf(filename_hls, sizeof(filename_hls), "master.m3u8");
 
-- 
2.15.2 (Apple Git-101.1)



Re: [FFmpeg-devel] About the maintainer of mips

2018-07-26 Thread yinshiyou...@loongson.cn
>> I heard from the previous maintainer for mips that he was no longer part of
>> the mips company, and as a result, my patch was still pending review.
>> Will the ffmpeg community assign a new maintainer for mips?
> 
>No, you have to send a patch that changes the maintainership to you,
>see MAINTAINERS in the main directory.

Thank you very much for your reply. I sent a patch to add myself to the mips 
section today : )



Name: 殷时友
Phone: 153 0560 8910
Email: yinshiyou...@loongson.cn


[FFmpeg-devel] [PATCH 1/3] diracdec: add 10-bit Haar SIMD functions

2018-07-26 Thread James Darnley
Speed of ffmpeg when decoding a 720p yuv422p10 file encoded with the
relevant transform.
C:    119fps
SSE2: 204fps
AVX:  206fps
AVX2: 221fps

timer measurements, haar horizontal compose:
sse2: 3.68x faster (45143 vs. 12279 decicycles) compared with C
avx:  3.68x faster (45143 vs. 12275 decicycles) compared with C
avx2: 5.16x faster (45143 vs.  8742 decicycles) compared with C
haar vertical compose:
sse2: 1.64x faster (31792 vs. 19377 decicycles) compared with C
avx:  1.58x faster (31792 vs. 20090 decicycles) compared with C
avx2: 1.66x faster (31792 vs. 19157 decicycles) compared with C
---
 libavcodec/dirac_dwt.c|   7 +-
 libavcodec/dirac_dwt.h|   1 +
 libavcodec/x86/Makefile   |   6 +-
 libavcodec/x86/dirac_dwt_10bit.asm| 160 ++
 libavcodec/x86/dirac_dwt_init_10bit.c |  76 
 5 files changed, 247 insertions(+), 3 deletions(-)
 create mode 100644 libavcodec/x86/dirac_dwt_10bit.asm
 create mode 100644 libavcodec/x86/dirac_dwt_init_10bit.c

diff --git a/libavcodec/dirac_dwt.c b/libavcodec/dirac_dwt.c
index cc08f8865a..86bee5bb9b 100644
--- a/libavcodec/dirac_dwt.c
+++ b/libavcodec/dirac_dwt.c
@@ -59,8 +59,13 @@ int ff_spatial_idwt_init(DWTContext *d, DWTPlane *p, enum 
dwt_type type,
 return AVERROR_INVALIDDATA;
 }
 
-if (ARCH_X86 && bit_depth == 8)
+#if ARCH_X86
+if (bit_depth == 8)
 ff_spatial_idwt_init_x86(d, type);
+else if (bit_depth == 10)
+ff_spatial_idwt_init_10bit_x86(d, type);
+#endif
+
 return 0;
 }
 
diff --git a/libavcodec/dirac_dwt.h b/libavcodec/dirac_dwt.h
index 994dc21d70..1ad7b9a821 100644
--- a/libavcodec/dirac_dwt.h
+++ b/libavcodec/dirac_dwt.h
@@ -88,6 +88,7 @@ enum dwt_type {
 int ff_spatial_idwt_init(DWTContext *d, DWTPlane *p, enum dwt_type type,
  int decomposition_count, int bit_depth);
 void ff_spatial_idwt_init_x86(DWTContext *d, enum dwt_type type);
+void ff_spatial_idwt_init_10bit_x86(DWTContext *d, enum dwt_type type);
 
 void ff_spatial_idwt_slice2(DWTContext *d, int y);
 
diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
index 2350c8bbee..590d83c167 100644
--- a/libavcodec/x86/Makefile
+++ b/libavcodec/x86/Makefile
@@ -7,7 +7,8 @@ OBJS-$(CONFIG_BLOCKDSP)+= x86/blockdsp_init.o
 OBJS-$(CONFIG_BSWAPDSP)+= x86/bswapdsp_init.o
 OBJS-$(CONFIG_DCT) += x86/dct_init.o
 OBJS-$(CONFIG_DIRAC_DECODER)   += x86/diracdsp_init.o   \
-  x86/dirac_dwt_init.o
+  x86/dirac_dwt_init.o \
+  x86/dirac_dwt_init_10bit.o
 OBJS-$(CONFIG_FDCTDSP) += x86/fdctdsp_init.o
 OBJS-$(CONFIG_FFT) += x86/fft_init.o
 OBJS-$(CONFIG_FLACDSP) += x86/flacdsp_init.o
@@ -153,7 +154,8 @@ X86ASM-OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp.o
 X86ASM-OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsidct.o
 X86ASM-OBJS-$(CONFIG_DCA_DECODER)  += x86/dcadsp.o x86/synth_filter.o
 X86ASM-OBJS-$(CONFIG_DIRAC_DECODER)+= x86/diracdsp.o\
-  x86/dirac_dwt.o
+  x86/dirac_dwt.o \
+  x86/dirac_dwt_10bit.o
 X86ASM-OBJS-$(CONFIG_DNXHD_ENCODER)+= x86/dnxhdenc.o
 X86ASM-OBJS-$(CONFIG_EXR_DECODER)  += x86/exrdsp.o
 X86ASM-OBJS-$(CONFIG_FLAC_DECODER) += x86/flacdsp.o
diff --git a/libavcodec/x86/dirac_dwt_10bit.asm 
b/libavcodec/x86/dirac_dwt_10bit.asm
new file mode 100644
index 00..baea91329e
--- /dev/null
+++ b/libavcodec/x86/dirac_dwt_10bit.asm
@@ -0,0 +1,160 @@
+;**
+;* x86 optimized discrete 10-bit wavelet transform
+;* Copyright (c) 2018 James Darnley
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+;**
+
+%include "libavutil/x86/x86util.asm"
+
+SECTION_RODATA
+
+cextern pd_1
+
+SECTION .text
+
+%macro HAAR_VERTICAL 0
+
+cglobal vertical_compose_haar_10bit, 3, 6, 4, b0, b1, w
+DECLARE_REG_TMP 

[FFmpeg-devel] [PATCH 0/3 v2] x86 SIMD for dirac 10-bit wavelet transforms

2018-07-26 Thread James Darnley
I will ask the same question as last time.  Is the AVX worth it in Haar?  Also I
am surprised that the AVX2 doesn't have a bigger difference on some of the
vertical transforms.
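
One way to put a number on the AVX question is to express each variant as a relative gain over SSE2, using the decoding rates quoted in patch 1/3. The helper below is only a sketch for that arithmetic; gain_percent is a name made up here.

```c
/* Relative throughput gain of `fps` over `base_fps`, in percent. */
static double gain_percent(double base_fps, double fps)
{
    return (fps / base_fps - 1.0) * 100.0;
}
```

With the Haar figures (SSE2 204fps, AVX 206fps, AVX2 221fps), gain_percent(204, 206) comes out just under 1% and gain_percent(204, 221) at roughly 8.3%.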

James Darnley (3):
  diracdec: add 10-bit Haar SIMD functions
  diracdec: add 10-bit Legall 5,3 (5_3) SIMD functions
  diracdec: add 10-bit Deslauriers-Dubuc 9,7 (9_7) vertical high-pass function

 libavcodec/dirac_dwt.c|   7 +-
 libavcodec/dirac_dwt.h|   1 +
 libavcodec/x86/Makefile   |   6 +-
 libavcodec/x86/dirac_dwt_10bit.asm| 302 ++
 libavcodec/x86/dirac_dwt_init_10bit.c | 118 ++
 5 files changed, 431 insertions(+), 3 deletions(-)
 create mode 100644 libavcodec/x86/dirac_dwt_10bit.asm
 create mode 100644 libavcodec/x86/dirac_dwt_init_10bit.c

-- 
2.18.0



[FFmpeg-devel] [PATCH 3/3] diracdec: add 10-bit Deslauriers-Dubuc 9, 7 (9_7) vertical high-pass function

2018-07-26 Thread James Darnley
Speed of ffmpeg when decoding a 720p yuv422p10 file encoded with the
relevant transform.
C: 84fps
SSE2: 111fps
AVX2: 115fps

dd97 vertical hi
sse2: 2.77x faster (31773 vs. 11457 decicycles) compared with C
avx2: 3.83x faster (31773 vs.  8297 decicycles) compared with C
---
 libavcodec/x86/dirac_dwt_10bit.asm| 39 +++
 libavcodec/x86/dirac_dwt_init_10bit.c | 29 
 2 files changed, 68 insertions(+)

diff --git a/libavcodec/x86/dirac_dwt_10bit.asm 
b/libavcodec/x86/dirac_dwt_10bit.asm
index 0295e6f554..2ed77fe3b0 100644
--- a/libavcodec/x86/dirac_dwt_10bit.asm
+++ b/libavcodec/x86/dirac_dwt_10bit.asm
@@ -25,6 +25,7 @@ SECTION_RODATA 32
 
 cextern pd_1
 pd_2: times 8 dd 2
+pd_8: times 8 dd 8
 
 SECTION .text
 
@@ -246,7 +247,44 @@ RET
 
 %endmacro
 
+%macro DD97_VERTICAL_HI 0
+
+cglobal dd97_vertical_hi, 6, 6, 8, b0, b1, b2, b3, b4, w
+mova m7, [pd_8]
+shl wd, 2
+add b0q, wq
+add b1q, wq
+add b2q, wq
+add b3q, wq
+add b4q, wq
+neg wq
+
+ALIGN 16
+.loop:
+mova m0, [b0q + wq]
+mova m1, [b1q + wq]
+mova m2, [b2q + wq]
+mova m3, [b3q + wq]
+mova m4, [b4q + wq]
+pslld m5, m1, 3
+pslld m6, m3, 3
+paddd m5, m1
+paddd m6, m3
+psubd m5, m0
+psubd m6, m4
+paddd m5, m7
+paddd m5, m6
+psrad m5, 4
+paddd m2, m5
+mova [b2q + wq], m2
+add wq, mmsize
+jl .loop
+RET
+
+%endmacro
+
 INIT_XMM sse2
+DD97_VERTICAL_HI
 HAAR_HORIZONTAL
 HAAR_VERTICAL
 LEGALL53_VERTICAL_HI
@@ -257,6 +295,7 @@ HAAR_HORIZONTAL
 HAAR_VERTICAL
 
 INIT_YMM avx2
+DD97_VERTICAL_HI
 HAAR_HORIZONTAL
 HAAR_VERTICAL
 LEGALL53_VERTICAL_HI
diff --git a/libavcodec/x86/dirac_dwt_init_10bit.c 
b/libavcodec/x86/dirac_dwt_init_10bit.c
index d1234efac5..a9ac603bc5 100644
--- a/libavcodec/x86/dirac_dwt_init_10bit.c
+++ b/libavcodec/x86/dirac_dwt_init_10bit.c
@@ -23,6 +23,9 @@
 #include "libavutil/x86/cpu.h"
 #include "libavcodec/dirac_dwt.h"
 
+void ff_dd97_vertical_hi_sse2(int32_t *b0, int32_t *b1, int32_t *b2, int32_t 
*b3, int32_t *b4, int width);
+void ff_dd97_vertical_hi_avx2(int32_t *b0, int32_t *b1, int32_t *b2, int32_t 
*b3, int32_t *b4, int width);
+
 void ff_legall53_vertical_hi_sse2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
 void ff_legall53_vertical_lo_sse2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
 void ff_legall53_vertical_hi_avx2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
@@ -36,6 +39,24 @@ void ff_vertical_compose_haar_10bit_sse2(int32_t *b0, 
int32_t *b1, int width_ali
 void ff_vertical_compose_haar_10bit_avx(int32_t *b0, int32_t *b1, int 
width_align);
 void ff_vertical_compose_haar_10bit_avx2(int32_t *b0, int32_t *b1, int 
width_align);
 
+static void dd97_vertical_hi_sse2(int32_t *b0, int32_t *b1, int32_t *b2,
+  int32_t *b3, int32_t *b4, int width)
+{
+int i = width & ~3;
+ff_dd97_vertical_hi_sse2(b0, b1, b2, b3, b4, i);
+for(; ivertical_compose_h0 = (void*)dd97_vertical_hi_sse2;
+d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_sse2;
+break;
 case DWT_DIRAC_LEGALL5_3:
 d->vertical_compose_h0 = (void*)ff_legall53_vertical_hi_sse2;
 d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_sse2;
@@ -71,6 +96,10 @@ av_cold void ff_spatial_idwt_init_10bit_x86(DWTContext *d, 
enum dwt_type type)
 
 if (EXTERNAL_AVX2(cpu_flags)) {
 switch (type) {
+case DWT_DIRAC_DD9_7:
+d->vertical_compose_h0 = (void*)dd97_vertical_hi_avx2;
+d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_avx2;
+break;
 case DWT_DIRAC_LEGALL5_3:
 d->vertical_compose_h0 = (void*)ff_legall53_vertical_hi_avx2;
 d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_avx2;
-- 
2.18.0
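
For readers who do not follow the assembly, the DD97_VERTICAL_HI loop body above can be read as a per-coefficient update. The following scalar C model is derived from the instruction sequence in this patch (shift left by 3 plus add gives a multiply by 9, pd_8 is the rounding term, psrad by 4 is the final shift); the function name dd97_vertical_hi_ref is hypothetical and this is not the C reference from dirac_dwt.

```c
#include <stdint.h>

/* Scalar model of the DD97_VERTICAL_HI loop body:
 *   b2 += (9*b1 - b0 + 9*b3 - b4 + 8) >> 4
 * Assumes >> on a negative int32_t is an arithmetic shift (as psrad is);
 * that is implementation-defined in C but holds on common compilers. */
static void dd97_vertical_hi_ref(const int32_t *b0, const int32_t *b1,
                                 int32_t *b2, const int32_t *b3,
                                 const int32_t *b4, int width)
{
    for (int i = 0; i < width; i++) {
        int32_t t = 9 * b1[i] - b0[i] + 9 * b3[i] - b4[i] + 8;
        b2[i] += t >> 4;
    }
}
```

A model like this is mainly useful as a comparison target when checking the SIMD output on random input rows.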



[FFmpeg-devel] [PATCH 2/3] diracdec: add 10-bit Legall 5, 3 (5_3) SIMD functions

2018-07-26 Thread James Darnley
Speed of ffmpeg when decoding a 720p yuv422p10 file encoded with the
relevant transform.
C: 94fps
SSE2: 118fps
AVX2: 121fps

legall vertical hi
sse2: 3.86x faster (20201 vs. 5231 decicycles) compared with C
avx2: 6.70x faster (20201 vs. 3014 decicycles) compared with C
legall vertical lo
sse2: 1.50x faster (28345 vs. 18908 decicycles) compared with C
avx2: 1.63x faster (28345 vs. 17361 decicycles) compared with C
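
The two kernels timed above each update one row from its two neighbours. The scalar C sketch below mirrors the .loop_scalar tails of the assembly in this patch; the _ref function names are made up for illustration and this is not the existing C template code.

```c
#include <stdint.h>

/* Low-pass step, as in the scalar tail: b1 -= (b0 + b2 + 2) >> 2 */
static void legall53_vertical_lo_ref(const int32_t *b0, int32_t *b1,
                                     const int32_t *b2, int width)
{
    for (int i = 0; i < width; i++)
        b1[i] -= (b0[i] + b2[i] + 2) >> 2;
}

/* High-pass step, as in the scalar tail: b1 += (b0 + b2 + 1) >> 1 */
static void legall53_vertical_hi_ref(const int32_t *b0, int32_t *b1,
                                     const int32_t *b2, int width)
{
    for (int i = 0; i < width; i++)
        b1[i] += (b0[i] + b2[i] + 1) >> 1;
}
```

The SIMD versions process the width rounded down to a multiple of mmsize/4 coefficients and leave the remainder to the scalar tail, so both paths compute the same expression.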
---
 libavcodec/x86/dirac_dwt_10bit.asm| 105 +-
 libavcodec/x86/dirac_dwt_init_10bit.c |  13 
 2 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/libavcodec/x86/dirac_dwt_10bit.asm 
b/libavcodec/x86/dirac_dwt_10bit.asm
index baea91329e..0295e6f554 100644
--- a/libavcodec/x86/dirac_dwt_10bit.asm
+++ b/libavcodec/x86/dirac_dwt_10bit.asm
@@ -21,9 +21,10 @@
 
 %include "libavutil/x86/x86util.asm"
 
-SECTION_RODATA
+SECTION_RODATA 32
 
 cextern pd_1
+pd_2: times 8 dd 2
 
 SECTION .text
 
@@ -147,9 +148,109 @@ REP_RET
 
 %endmacro
 
+%macro LEGALL53_VERTICAL_LO 0
+
+cglobal legall53_vertical_lo, 4, 6, 4, b0, b1, b2, w
+DECLARE_REG_TMP 3,4,5
+
+mova  m3, [pd_2]
+mov  t2d, wd
+and   wd, ~(mmsize/4 - 1)
+shl   wd, 2
+add  b0q, wq
+add  b1q, wq
+add  b2q, wq
+neg   wq
+
+ALIGN 16
+.loop:
+mova m0, [b0q + wq]
+mova m1, [b1q + wq]
+mova m2, [b2q + wq]
+paddd m0, m2
+paddd m0, m3
+psrad m0, 2
+psubd m1, m0
+mova [b1q + wq], m1
+add wq, mmsize
+jl .loop
+
+and  t2d, mmsize/4 - 1
+jz .end
+.loop_scalar:
+mov t0d, [b0q]
+mov t1d, [b1q]
+add t0d, [b2q]
+add t0d, 2
+sar t0d, 2
+sub t1d, t0d
+mov [b1q], t1d
+
+add b0q, 4
+add b1q, 4
+add b2q, 4
+sub t2d, 1
+jg .loop_scalar
+
+.end:
+RET
+
+%endmacro
+
+%macro LEGALL53_VERTICAL_HI 0
+
+cglobal legall53_vertical_hi, 4, 6, 4, b0, b1, b2, w
+DECLARE_REG_TMP 3,4,5
+
+mova  m3, [pd_1]
+mov  t2d, wd
+and   wd, ~(mmsize/4 - 1)
+shl   wd, 2
+add  b0q, wq
+add  b1q, wq
+add  b2q, wq
+neg   wq
+
+ALIGN 16
+.loop:
+mova m0, [b0q + wq]
+mova m1, [b1q + wq]
+mova m2, [b2q + wq]
+paddd m0, m2
+paddd m0, m3
+psrad m0, 1
+paddd m1, m0
+mova [b1q + wq], m1
+add wq, mmsize
+jl .loop
+
+and  t2d, mmsize/4 - 1
+jz .end
+.loop_scalar:
+mov t0d, [b0q]
+mov t1d, [b1q]
+add t0d, [b2q]
+add t0d, 1
+sar t0d, 1
+add t1d, t0d
+mov [b1q], t1d
+
+add b0q, 4
+add b1q, 4
+add b2q, 4
+sub t2d, 1
+jg .loop_scalar
+
+.end:
+RET
+
+%endmacro
+
 INIT_XMM sse2
 HAAR_HORIZONTAL
 HAAR_VERTICAL
+LEGALL53_VERTICAL_HI
+LEGALL53_VERTICAL_LO
 
 INIT_XMM avx
 HAAR_HORIZONTAL
@@ -158,3 +259,5 @@ HAAR_VERTICAL
 INIT_YMM avx2
 HAAR_HORIZONTAL
 HAAR_VERTICAL
+LEGALL53_VERTICAL_HI
+LEGALL53_VERTICAL_LO
diff --git a/libavcodec/x86/dirac_dwt_init_10bit.c 
b/libavcodec/x86/dirac_dwt_init_10bit.c
index 289862d728..d1234efac5 100644
--- a/libavcodec/x86/dirac_dwt_init_10bit.c
+++ b/libavcodec/x86/dirac_dwt_init_10bit.c
@@ -23,6 +23,11 @@
 #include "libavutil/x86/cpu.h"
 #include "libavcodec/dirac_dwt.h"
 
+void ff_legall53_vertical_hi_sse2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
+void ff_legall53_vertical_lo_sse2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
+void ff_legall53_vertical_hi_avx2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
+void ff_legall53_vertical_lo_avx2(int32_t *b0, int32_t *b1, int32_t *b2, int 
width);
+
 void ff_horizontal_compose_haar_10bit_sse2(int32_t *b0, int32_t *b1, int 
width_align);
 void ff_horizontal_compose_haar_10bit_avx(int32_t *b0, int32_t *b1, int 
width_align);
 void ff_horizontal_compose_haar_10bit_avx2(int32_t *b0, int32_t *b1, int 
width_align);
@@ -38,6 +43,10 @@ av_cold void ff_spatial_idwt_init_10bit_x86(DWTContext *d, 
enum dwt_type type)
 
 if (EXTERNAL_SSE2(cpu_flags)) {
 switch (type) {
+case DWT_DIRAC_LEGALL5_3:
+d->vertical_compose_h0 = (void*)ff_legall53_vertical_hi_sse2;
+d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_sse2;
+break;
 case DWT_DIRAC_HAAR0:
 d->vertical_compose = 
(void*)ff_vertical_compose_haar_10bit_sse2;
 break;
@@ -62,6 +71,10 @@ av_cold void ff_spatial_idwt_init_10bit_x86(DWTContext *d, 
enum dwt_type type)
 
 if (EXTERNAL_AVX2(cpu_flags)) {
 switch (type) {
+case DWT_DIRAC_LEGALL5_3:
+d->vertical_compose_h0 = (void*)ff_legall53_vertical_hi_avx2;
+d->vertical_compose_l0 = (void*)ff_legall53_vertical_lo_avx2;
+break;
 case DWT_DIRAC_HAAR0:
 d->vertical_compose = 

[FFmpeg-devel] [PATCH] avformat/librtmp: fix returning EOF from Read/Write

2018-07-26 Thread Timo Rothenpieler
---
 libavformat/librtmp.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/libavformat/librtmp.c b/libavformat/librtmp.c
index f3cfa9a8e2..43013e46e0 100644
--- a/libavformat/librtmp.c
+++ b/libavformat/librtmp.c
@@ -261,7 +261,10 @@ static int rtmp_write(URLContext *s, const uint8_t *buf, 
int size)
 LibRTMPContext *ctx = s->priv_data;
 RTMP *r = &ctx->rtmp;
 
-return RTMP_Write(r, buf, size);
+int ret = RTMP_Write(r, buf, size);
+if (!ret)
+return AVERROR_EOF;
+return ret;
 }
 
 static int rtmp_read(URLContext *s, uint8_t *buf, int size)
@@ -269,7 +272,10 @@ static int rtmp_read(URLContext *s, uint8_t *buf, int size)
 LibRTMPContext *ctx = s->priv_data;
 RTMP *r = &ctx->rtmp;
 
-return RTMP_Read(r, buf, size);
+int ret = RTMP_Read(r, buf, size);
+if (!ret)
+return AVERROR_EOF;
+return ret;
 }
 
 static int rtmp_read_pause(URLContext *s, int pause)
-- 
2.18.0
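
The change follows a common wrapper pattern: a backend that reports "no more data" as 0 is translated into an explicit EOF error code, so callers do not mistake it for a successful zero-byte transfer. A minimal sketch of that pattern follows; SKETCH_EOF and map_io_result are stand-ins invented here, not FFmpeg or librtmp symbols.

```c
/* Hypothetical stand-in for an explicit end-of-file error code. */
#define SKETCH_EOF (-1000)

/* Translate a backend I/O result: 0 means "no more data" and becomes an
 * explicit EOF code; positive byte counts and negative errors pass through. */
static int map_io_result(int backend_ret)
{
    if (backend_ret == 0)
        return SKETCH_EOF;
    return backend_ret;
}
```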



[FFmpeg-devel] [PATCH 2/2] lavc/encode: fix frame_number double-counted

2018-07-26 Thread Zhong Li
Encoder frame_number may be double-counted if some frames are cached and then
flushed.
Take the qsv encoder as an example (some frames are cached first for
asynchronous operation):
./ffmpeg -loglevel verbose -hwaccel qsv -c:v h264_qsv -i in.mp4 -vframes 100 
-c:v h264_qsv out.mp4
The frame_number passed to the encoder is double-counted and larger than the
accurate value.
Libx264 encoding with B-frames can also reproduce it.
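
The over-count can be illustrated with a toy model. It is not FFmpeg code: ToyEncoder, toy_encode and toy_run are invented for this sketch, and the only assumption is that flush calls pass a NULL frame while the counter is bumped on every successful call.

```c
#include <stddef.h>

/* Toy model, not FFmpeg code: tracks only the frame counter. */
typedef struct ToyEncoder {
    int frame_number;
} ToyEncoder;

/* guard == 0 models the old behaviour: count every successful call.
 * guard != 0 models the patched behaviour: count only real input frames. */
static void toy_encode(ToyEncoder *enc, const void *frame, int guard)
{
    if (!guard || frame)
        enc->frame_number++;
}

/* Feed n_frames real frames, then n_flush flush calls with frame == NULL. */
static int toy_run(int n_frames, int n_flush, int guard)
{
    ToyEncoder enc = { 0 };
    int dummy = 0;

    for (int i = 0; i < n_frames; i++)
        toy_encode(&enc, &dummy, guard);
    for (int i = 0; i < n_flush; i++)
        toy_encode(&enc, NULL, guard);
    return enc.frame_number;
}
```

With 100 input frames and, say, 4 flush calls, the unguarded model ends at 104 while the guarded one ends at 100.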

Signed-off-by: Zhong Li 
---
 libavcodec/encode.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libavcodec/encode.c b/libavcodec/encode.c
index d976151..98c44c3 100644
--- a/libavcodec/encode.c
+++ b/libavcodec/encode.c
@@ -235,8 +235,8 @@ int attribute_align_arg 
avcodec_encode_audio2(AVCodecContext *avctx,
 if (ret >= 0)
 avpkt->data = avpkt->buf->data;
 }
-
-avctx->frame_number++;
+if (frame)
+avctx->frame_number++;
 }
 
 if (ret < 0 || !*got_packet_ptr) {
@@ -333,7 +333,8 @@ int attribute_align_arg 
avcodec_encode_video2(AVCodecContext *avctx,
 avpkt->data = avpkt->buf->data;
 }
 
-avctx->frame_number++;
+if (frame)
+avctx->frame_number++;
 }
 
 if (ret < 0 || !*got_packet_ptr)
-- 
2.7.4



[FFmpeg-devel] [PATCH 1/2] lavc/qsvenc: expose qp of encoded frames

2018-07-26 Thread Zhong Li
Requirement from ticket #7254.
Currently only H264 supported by MSDK.

Signed-off-by: Zhong Li 
---
 libavcodec/qsvenc.c  | 43 +++
 libavcodec/qsvenc.h  |  2 ++
 libavcodec/qsvenc_h264.c |  5 +
 3 files changed, 50 insertions(+)

diff --git a/libavcodec/qsvenc.c b/libavcodec/qsvenc.c
index 8096945..1294ed2 100644
--- a/libavcodec/qsvenc.c
+++ b/libavcodec/qsvenc.c
@@ -1139,6 +1139,10 @@ static int encode_frame(AVCodecContext *avctx, 
QSVEncContext *q,
 {
 AVPacket new_pkt = { 0 };
 mfxBitstream *bs;
+#if QSV_VERSION_ATLEAST(1, 26)
+mfxExtAVCEncodedFrameInfo *enc_info;
+mfxExtBuffer **enc_buf;
+#endif
 
 mfxFrameSurface1 *surf = NULL;
 mfxSyncPoint *sync = NULL;
@@ -1172,6 +1176,22 @@ static int encode_frame(AVCodecContext *avctx, 
QSVEncContext *q,
 bs->Data  = new_pkt.data;
 bs->MaxLength = new_pkt.size;
 
+#if QSV_VERSION_ATLEAST(1, 26)
+if (avctx->codec_id == AV_CODEC_ID_H264) {
+enc_info = av_mallocz(sizeof(*enc_info));
+if (!enc_info)
+return AVERROR(ENOMEM);
+
+enc_info->Header.BufferId = MFX_EXTBUFF_ENCODED_FRAME_INFO;
+enc_info->Header.BufferSz = sizeof (*enc_info);
+bs->NumExtParam = 1;
+enc_buf = av_mallocz(sizeof(mfxExtBuffer *));
+enc_buf[0] = (mfxExtBuffer *)enc_info;
+
+bs->ExtParam = enc_buf;
+}
+#endif
+
 if (q->set_encode_ctrl_cb) {
 q->set_encode_ctrl_cb(avctx, frame, &qsv_frame->enc_ctrl);
 }
@@ -1179,6 +1199,10 @@ static int encode_frame(AVCodecContext *avctx, 
QSVEncContext *q,
 sync = av_mallocz(sizeof(*sync));
 if (!sync) {
 av_freep(&bs);
+#if QSV_VERSION_ATLEAST(1, 26)
+if (avctx->codec_id == AV_CODEC_ID_H264)
+av_freep(&enc_info);
+#endif
 av_packet_unref(&new_pkt);
 return AVERROR(ENOMEM);
 }
@@ -1195,6 +1219,10 @@ static int encode_frame(AVCodecContext *avctx, 
QSVEncContext *q,
 if (ret < 0) {
 av_packet_unref(&new_pkt);
 av_freep(&bs);
+#if QSV_VERSION_ATLEAST(1, 26)
+if (avctx->codec_id == AV_CODEC_ID_H264)
+av_freep(&enc_info);
+#endif
 av_freep(&sync);
 return (ret == MFX_ERR_MORE_DATA) ?
0 : ff_qsv_print_error(avctx, ret, "Error during encoding");
@@ -1211,6 +1239,10 @@ static int encode_frame(AVCodecContext *avctx, 
QSVEncContext *q,
 av_freep(&sync);
 av_packet_unref(&new_pkt);
 av_freep(&bs);
+#if QSV_VERSION_ATLEAST(1, 26)
+if (avctx->codec_id == AV_CODEC_ID_H264)
+av_freep(&enc_info);
+#endif
 }
 
 return 0;
@@ -1230,6 +1262,9 @@ int ff_qsv_encode(AVCodecContext *avctx, QSVEncContext *q,
 AVPacket new_pkt;
 mfxBitstream *bs;
 mfxSyncPoint *sync;
+#if QSV_VERSION_ATLEAST(1, 26)
+mfxExtAVCEncodedFrameInfo *enc_info;
+#endif
 
 av_fifo_generic_read(q->async_fifo, &new_pkt, sizeof(new_pkt), NULL);
 av_fifo_generic_read(q->async_fifo, &sync,sizeof(sync),NULL);
@@ -1258,6 +1293,14 @@ FF_DISABLE_DEPRECATION_WARNINGS
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
 
+#if QSV_VERSION_ATLEAST(1, 26)
+if (avctx->codec_id == AV_CODEC_ID_H264) {
+enc_info = (mfxExtAVCEncodedFrameInfo *)(*bs->ExtParam);
+av_log(avctx, AV_LOG_DEBUG, "QP is %d\n", enc_info->QP);
+q->sum_frame_qp += enc_info->QP;
+av_freep(&enc_info);
+}
+#endif
 av_freep(&bs);
 av_freep(&sync);
 
diff --git a/libavcodec/qsvenc.h b/libavcodec/qsvenc.h
index b2d6355..3784a82 100644
--- a/libavcodec/qsvenc.h
+++ b/libavcodec/qsvenc.h
@@ -102,6 +102,8 @@ typedef struct QSVEncContext {
 int width_align;
 int height_align;
 
+int sum_frame_qp;
+
 mfxVideoParam param;
 mfxFrameAllocRequest req;
 
diff --git a/libavcodec/qsvenc_h264.c b/libavcodec/qsvenc_h264.c
index 5c262e5..b87bef6 100644
--- a/libavcodec/qsvenc_h264.c
+++ b/libavcodec/qsvenc_h264.c
@@ -95,6 +95,11 @@ static av_cold int qsv_enc_close(AVCodecContext *avctx)
 {
 QSVH264EncContext *q = avctx->priv_data;
 
+#if QSV_VERSION_ATLEAST(1, 26)
+av_log(avctx, AV_LOG_VERBOSE, "encoded %d frames, average qp is %.2f\n",
+avctx->frame_number,(double)q->qsv.sum_frame_qp / avctx->frame_number);
+#endif
+
 return ff_qsv_enc_close(avctx, &q->qsv);
 }
 
-- 
2.7.4
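
The close-time log line in this patch divides the accumulated QP sum by the frame count. A small sketch of that computation follows; the zero-count guard is an addition of this sketch rather than something the patch does, and average_qp is a name made up here.

```c
/* Average QP over the encoded frames. Returns 0.0 when no frame was
 * encoded, so the division is never performed with a zero count.
 * The guard is an addition of this sketch, not part of the patch. */
static double average_qp(int sum_frame_qp, int frame_count)
{
    if (frame_count <= 0)
        return 0.0;
    return (double)sum_frame_qp / frame_count;
}
```

For instance, a QP sum of 260 over 10 frames gives an average of 26.0.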



[FFmpeg-devel] [PATCH 1/2] tests/audiogen: raise channel count limit to 12

2018-07-26 Thread Tobias Rapp
Signed-off-by: Tobias Rapp 
---
 tests/audiogen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/audiogen.c b/tests/audiogen.c
index 8d596b5..c43bb70 100644
--- a/tests/audiogen.c
+++ b/tests/audiogen.c
@@ -26,7 +26,7 @@
 #include 
 #include 
 
-#define MAX_CHANNELS 8
+#define MAX_CHANNELS 12
 
 static unsigned int myrnd(unsigned int *seed_ptr, int n)
 {
-- 
2.7.4




[FFmpeg-devel] [PATCH 2/2] fate: add tests for audio channel up-/downmixing with pan filter

2018-07-26 Thread Tobias Rapp
Add tests for upmixing and downmixing with audio channel counts that
have a corresponding default layout and also tests where there is no
default layout.

Update the existing "stereo4" test so it actually outputs stereo like
the other stereo tests. Rename the previous "stereo4" test into
"upmix1".
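
For reference, each pan expression used in these tests describes one output channel as a weighted sum of input channels. The sketch below applies such a gain matrix to a single multi-channel sample; it is an illustration under that reading, not the af_pan implementation, and pan_mix is a hypothetical name.

```c
/* Apply a row-major gain matrix (n_out rows, n_in columns) to one
 * multi-channel sample: out[o] = sum over i of gain[o*n_in + i] * in[i]. */
static void pan_mix(const double *gain, int n_out, int n_in,
                    const double *in, double *out)
{
    for (int o = 0; o < n_out; o++) {
        double acc = 0.0;
        for (int i = 0; i < n_in; i++)
            acc += gain[o * n_in + i] * in[i];
        out[o] = acc;
    }
}
```

The updated "stereo4" expression "c0=c0-0.5*c1|c1=c1+0.5*c0" corresponds to the 2x2 matrix { 1, -0.5, 0.5, 1 }; an input sample of (0.5, 0.25) then maps to (0.375, 0.5).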

Signed-off-by: Tobias Rapp 
---
 tests/fate/filter-audio.mak| 22 +++-
 tests/ref/fate/filter-pan-downmix1 | 26 ++
 tests/ref/fate/filter-pan-downmix2 | 26 ++
 tests/ref/fate/filter-pan-stereo4  | 42 +++---
 .../fate/{filter-pan-stereo4 => filter-pan-upmix1} |  0
 tests/ref/fate/filter-pan-upmix2   | 26 ++
 6 files changed, 120 insertions(+), 22 deletions(-)
 create mode 100644 tests/ref/fate/filter-pan-downmix1
 create mode 100644 tests/ref/fate/filter-pan-downmix2
 copy tests/ref/fate/{filter-pan-stereo4 => filter-pan-upmix1} (100%)
 create mode 100644 tests/ref/fate/filter-pan-upmix2

diff --git a/tests/fate/filter-audio.mak b/tests/fate/filter-audio.mak
index 6125a37..473b8ae 100644
--- a/tests/fate/filter-audio.mak
+++ b/tests/fate/filter-audio.mak
@@ -156,7 +156,27 @@ fate-filter-pan-stereo3: CMD = framecrc -ss 3.14 -i $(SRC) 
-frames:a 20 -filter:
 FATE_AFILTER-$(call FILTERDEMDECENCMUX, PAN, WAV, PCM_S16LE, PCM_S16LE, WAV) 
+= fate-filter-pan-stereo4
 fate-filter-pan-stereo4: tests/data/asynth-44100-2.wav
 fate-filter-pan-stereo4: SRC = $(TARGET_PATH)/tests/data/asynth-44100-2.wav
-fate-filter-pan-stereo4: CMD = framecrc -ss 3.14 -guess_layout_max 0 -i $(SRC) 
-frames:a 20 -filter:a "pan=4C|c0=c0-0.5*c1|c1=c1+0.5*c0|c2=0*c0|c3=0*c0"
+fate-filter-pan-stereo4: CMD = framecrc -ss 3.14 -guess_layout_max 0 -i $(SRC) 
-frames:a 20 -filter:a "pan=2C|c0=c0-0.5*c1|c1=c1+0.5*c0"
+
+FATE_AFILTER-$(call FILTERDEMDECENCMUX, PAN, WAV, PCM_S16LE, PCM_S16LE, WAV) 
+= fate-filter-pan-upmix1
+fate-filter-pan-upmix1: tests/data/asynth-44100-2.wav
+fate-filter-pan-upmix1: SRC = $(TARGET_PATH)/tests/data/asynth-44100-2.wav
+fate-filter-pan-upmix1: CMD = framecrc -ss 3.14 -guess_layout_max 0 -i $(SRC) 
-frames:a 20 -filter:a "pan=4C|c0=c0-0.5*c1|c1=c1+0.5*c0|c2=0*c0|c3=0*c0"
+
+FATE_AFILTER-$(call FILTERDEMDECENCMUX, PAN, WAV, PCM_S16LE, PCM_S16LE, WAV) 
+= fate-filter-pan-upmix2
+fate-filter-pan-upmix2: tests/data/asynth-44100-4.wav
+fate-filter-pan-upmix2: SRC = $(TARGET_PATH)/tests/data/asynth-44100-4.wav
+fate-filter-pan-upmix2: CMD = framecrc -ss 3.14 -i $(SRC) -frames:a 20 
-filter:a 
"pan=9C|c0=c0-c1|c1=c2+c3|c2=c0+c1|c3=c2-c3|c4=c1-c0|c5=c3+c2|c6=c1+c0|c7=c3-c2|c8=c0-c3"
+
+FATE_AFILTER-$(call FILTERDEMDECENCMUX, PAN, WAV, PCM_S16LE, PCM_S16LE, WAV) 
+= fate-filter-pan-downmix1
+fate-filter-pan-downmix1: tests/data/asynth-44100-4.wav
+fate-filter-pan-downmix1: SRC = $(TARGET_PATH)/tests/data/asynth-44100-4.wav
+fate-filter-pan-downmix1: CMD = framecrc -ss 3.14 -i $(SRC) -frames:a 20 
-filter:a "pan=2c|FL


Re: [FFmpeg-devel] [PATCH] doc/formats: Add documentation for skip_estimate_duration_from_pts

2018-07-26 Thread Gyan Doshi



On 26-07-2018 03:37 AM, Michael Niedermayer wrote:

On Wed, Jul 25, 2018 at 10:13:58AM +0530, Gyan Doshi wrote:



Wouldn't it be better to move this as a private option for those two
demuxers?


I am not sure, it could be used by others in the future too.
What do people prefer? I have no real opinion on where to put it as long
as it is documented somewhere in the docs ...


Looking at the history, this function was added for MPEG-PS in 2003, and 
then used for TS in 2006. Hasn't been used for any other demuxers. You 
then added a similar function for NUT but within the demuxer.


We can leave the possible shifting for later on, but let's add the 
limitation to the doc.


Thanks,
Gyan