Re: [FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread compn
On Thu, 23 Jun 2016 23:00:48 +0200
Michael Niedermayer  wrote:

> what shall FFmpeg 3.1 be called ?
> Are there other suggestions?

Has Fibonacci been used?

-compn
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread Wang Bin
What about choosing a scientist who was born in the month of the FFmpeg release?


Re: [FFmpeg-devel] [PATCH 05/10] diractab: expose the maximum quantization index as a macro

2016-06-23 Thread Michael Niedermayer
On Thu, Jun 23, 2016 at 06:06:59PM +0100, Rostislav Pehlivanov wrote:
> Prevents having to have random magic values in the decoder and a
> separate macro in the encoder.
> 
> Signed-off-by: Rostislav Pehlivanov 
> ---
>  libavcodec/diracdec.c | 8 
>  libavcodec/diractab.h | 2 ++
>  libavcodec/vc2enc.c   | 9 +++--
>  3 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
> index c8ab2df..48ba194 100644
> --- a/libavcodec/diracdec.c
> +++ b/libavcodec/diracdec.c
> @@ -486,7 +486,7 @@ static inline void codeblock(DiracContext *s, SubBand *b,
>  b->quant = quant;
>  }
>  
> -if (b->quant > 115) {
> +if (b->quant > DIRAC_MAX_QUANT_INDEX) {
>  av_log(s->avctx, AV_LOG_ERROR, "Unsupported quant %d\n", b->quant);
>  b->quant = 0;
>  return;
> @@ -676,12 +676,12 @@ static void decode_subband(DiracContext *s, 
> GetBitContext *gb, int quant,
>  uint8_t *buf2 = b2 ? b2->ibuf + top * b2->stride: NULL;
>  int x, y;
>  
> -if (quant > 115) {
> +if (quant > DIRAC_MAX_QUANT_INDEX) {
>  av_log(s->avctx, AV_LOG_ERROR, "Unsupported quant %d\n", quant);
>  return;
>  }

> -qfactor = ff_dirac_qscale_tab[quant & 0x7f];
> -qoffset = ff_dirac_qoffset_intra_tab[quant & 0x7f] + 2;
> +qfactor = ff_dirac_qscale_tab[quant & DIRAC_MAX_QUANT_INDEX];
> +qoffset = ff_dirac_qoffset_intra_tab[quant & DIRAC_MAX_QUANT_INDEX] + 2;

If I am not missing anything, then
DIRAC_MAX_QUANT_INDEX / FF_ARRAY_ELEMS(ff_dirac_qscale_tab) is 116,
so quant & 116 looks unintended.
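
A minimal sketch of why the mask matters, assuming the macro really evaluates to 116 as computed above. The helper names are hypothetical and not part of the patch. 116 is 0x74, which has cleared low bits, so using it as an AND mask remaps valid indices, whereas 0x7f leaves every value up to 127 unchanged.

```c
/* Hypothetical helpers for illustration only; not FFmpeg code. */
#define OLD_MASK 0x7f /* 127: all seven low bits set */
#define NEW_MASK 116  /* 0x74: assumed value of DIRAC_MAX_QUANT_INDEX */

/* Table index as computed before the patch. */
static int index_old(int quant) { return quant & OLD_MASK; }

/* Table index as computed after the patch. */
static int index_new(int quant) { return quant & NEW_MASK; }

/* Count how many quant values in 0..115 end up at a different index. */
static int count_remapped(void)
{
    int n = 0;
    for (int quant = 0; quant <= 115; quant++)
        if (index_old(quant) != index_new(quant))
            n++;
    return n;
}
```

With these definitions, quant 1 maps to index 0 and quant 115 maps to index 112, so most table lookups would silently read the wrong entry.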


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I know you won't believe me, but the highest form of Human Excellence is
to question oneself and others. -- Socrates




Re: [FFmpeg-devel] [PATCH 2/2] lavd/decklink_common: Fix error caused by -Werror=missing-prototypes

2016-06-23 Thread Richard Kern

> On Jun 23, 2016, at 8:06 PM, Carl Eugen Hoyos  wrote:
> 
> Rick Kern  gmail.com> writes:
> 
>> This temporarily disables the missing-prototypes error so 
>> the file can be included.
> 
> Can't you add -Wno-error=missing-prototypes to the cxx flags 
> just as you did in 1/2?

This was just disabling it for the bad include instead of every file. I’ll add 
it to the cxx flags - much cleaner anyway.
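
For context, a minimal sketch of what -Wmissing-prototypes reports, assuming GCC or Clang compiling C. The function names are hypothetical and not taken from the DeckLink SDK. A function with external linkage that is defined without a previous prototype is flagged, which is why an included source file full of such definitions fails once the warning is promoted to an error.

```c
/* Would be reported: external linkage, no prototype seen before the definition. */
int helper_without_prototype(int x)
{
    return x + 1;
}

/* Not reported: internal linkage. */
static int helper_static(int x)
{
    return x + 2;
}

/* Not reported: a prototype precedes the definition. */
int helper_declared(int x);
int helper_declared(int x)
{
    return x + 3;
}
```

Since the SDK file cannot be edited to add static or prototypes, the remaining options are the ones discussed here: silence the diagnostic around the include, or drop it from the error set in the compiler flags.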

> 
> Thank you for working on this, I like 1/2, Carl Eugen


Re: [FFmpeg-devel] [PATCH 2/2] lavd/decklink_common: Fix error caused by -Werror=missing-prototypes

2016-06-23 Thread Carl Eugen Hoyos
Rick Kern  gmail.com> writes:

> This temporarily disables the missing-prototypes error so 
> the file can be included.

Can't you add -Wno-error=missing-prototypes to the cxx flags 
just as you did in 1/2?

Thank you for working on this, I like 1/2, Carl Eugen



[FFmpeg-devel] [PATCH 1/2] lavd/decklink: Fix compile issue on OS X

2016-06-23 Thread Rick Kern
Fixes #4124: Invalid argument '-std=c99' not allowed with 'C++/ObjC++'
C++ files fail to compile. This adds '-std=c++11' to CXX_FLAGS to fix.

Signed-off-by: Rick Kern 
---
 common.mak | 2 +-
 configure  | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/common.mak b/common.mak
index 59b039f..3f2096d 100644
--- a/common.mak
+++ b/common.mak
@@ -39,7 +39,7 @@ CCFLAGS = $(CPPFLAGS) $(CFLAGS)
 OBJCFLAGS  += $(EOBJCFLAGS)
 OBJCCFLAGS  = $(CPPFLAGS) $(CFLAGS) $(OBJCFLAGS)
 ASFLAGS:= $(CPPFLAGS) $(ASFLAGS)
-CXXFLAGS   += $(CPPFLAGS) $(CFLAGS)
+CXXFLAGS   := $(CPPFLAGS) $(CFLAGS) $(CXXFLAGS)
 YASMFLAGS  += $(IFLAGS:%=%/) -Pconfig.asm
 
 HOSTCCFLAGS = $(IFLAGS) $(HOSTCPPFLAGS) $(HOSTCFLAGS)
diff --git a/configure b/configure
index 94a0a6c..3787894 100755
--- a/configure
+++ b/configure
@@ -4519,6 +4519,7 @@ fi
 
 add_cppflags -D_ISOC99_SOURCE
 add_cxxflags -D__STDC_CONSTANT_MACROS
+add_cxxflags -std=c++11
 check_cflags -std=c99
 check_cc -D_FILE_OFFSET_BITS=64 <
-- 
2.9.0



[FFmpeg-devel] [PATCH 2/2] lavd/decklink_common: Fix error caused by -Werror=missing-prototypes

2016-06-23 Thread Rick Kern
decklink_common.cpp includes a .cpp file from the DeckLink API which fails
to build because there are non-static functions in the included .cpp file.
This temporarily disables the missing-prototypes error so the file can
be included.

Signed-off-by: Rick Kern 
---
 libavdevice/decklink_common.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/libavdevice/decklink_common.cpp b/libavdevice/decklink_common.cpp
index ac7964c..f4d4275 100644
--- a/libavdevice/decklink_common.cpp
+++ b/libavdevice/decklink_common.cpp
@@ -23,7 +23,25 @@
 #ifdef _WIN32
 #include 
 #else
+#include "libavutil/attributes.h"
+
+#if AV_GCC_VERSION_AT_LEAST(4, 6)
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wmissing-prototypes"
+#warning GCC
+#elif defined(__clang__)
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wmissing-prototypes"
+#endif
+
 #include 
+
+#if AV_GCC_VERSION_AT_LEAST(4, 6)
+#pragma GCC diagnostic pop
+#elif defined(__clang__)
+#pragma clang diagnostic pop
+#warning clang
+#endif
 #endif
 
 #include 
-- 
2.9.0



[FFmpeg-devel] [PATCH 0/2] DeckLink build fixes

2016-06-23 Thread Rick Kern
This fixes two build issues on OS X when FFmpeg is configured with --enable-decklink.

Rick Kern (2):
  lavd/decklink: Fix compile issue on OS X
  lavd/decklink_common: Fix error caused by -Werror=missing-prototypes

 common.mak  |  2 +-
 configure   |  1 +
 libavdevice/decklink_common.cpp | 18 ++
 3 files changed, 20 insertions(+), 1 deletion(-)

-- 
2.9.0



Re: [FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread Bodecs Bela



On 2016-06-23 at 23:23, Paul B Mahol wrote:
> On 6/23/16, Rostislav Pehlivanov  wrote:
>> On 23 June 2016 at 22:00, Michael Niedermayer 
>> wrote:
>>
>>> Hi all
>>>
>>> what shall FFmpeg 3.1 be called ?
>>>
>>> I still have these ideas from past suggestions:
>>> Von Neumann, Einstein, Lorentz, Poincaré, de Sitter, de Broglie, Gauss,
>>> Galois, Viterbi, Darwin
>>>
>>> Are there other suggestions?
>>> Is something preferred ?
>>>
>>> In the absence of any preference I'll pick something randomly
>>>
>>> Thanks
>>>
>>> --
>>> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>>>
>>> No great genius has ever existed without some touch of madness. --
>>> Aristotle
>>>
>> Laplace.
>
> Carl

Lorentz or Laplace

And a suggestion for later times:  Heaviside

https://en.wikipedia.org/wiki/Oliver_Heaviside

best,

Bela


Re: [FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread Paul B Mahol
On 6/23/16, Rostislav Pehlivanov  wrote:
> On 23 June 2016 at 22:00, Michael Niedermayer 
> wrote:
>
>> Hi all
>>
>> what shall FFmpeg 3.1 be called ?
>>
>> I still have these ideas from past suggestions:
>> Von Neumann, Einstein, Lorentz, Poincaré, de Sitter, de Broglie, Gauss,
>> Galois, Viterbi, Darwin
>>
>> Are there other suggestions?
>> Is something preferred ?
>>
>> In the absence of any preference I'll pick something randomly
>>
>> Thanks
>>
>> --
>> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>>
>> No great genius has ever existed without some touch of madness. --
>> Aristotle
>>
> Laplace.

Carl


Re: [FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread Rostislav Pehlivanov
On 23 June 2016 at 22:00, Michael Niedermayer 
wrote:

> Hi all
>
> what shall FFmpeg 3.1 be called ?
>
> I still have these ideas from past suggestions:
> Von Neumann, Einstein, Lorentz, Poincaré, de Sitter, de Broglie, Gauss,
> Galois, Viterbi, Darwin
>
> Are there other suggestions?
> Is something preferred ?
>
> In the absence of any preference I'll pick something randomly
>
> Thanks
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> No great genius has ever existed without some touch of madness. --
> Aristotle
>
Laplace.


[FFmpeg-devel] FFmpeg 3.1 name

2016-06-23 Thread Michael Niedermayer
Hi all

what shall FFmpeg 3.1 be called ?

I still have these ideas from past suggestions:
Von Neumann, Einstein, Lorentz, Poincaré, de Sitter, de Broglie, Gauss, Galois,
Viterbi, Darwin

Are there other suggestions?
Is something preferred ?

In the absence of any preference I'll pick something randomly

Thanks

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle




Re: [FFmpeg-devel] [PATCH] h264: make H264ParamSets sps const

2016-06-23 Thread Michael Niedermayer
On Thu, Jun 23, 2016 at 03:28:10PM +0200, Benoit Fouet wrote:
> Hi,
> 
> 
> On 21/06/2016 16:42, Benoit Fouet wrote:
> >Hi,
> >
> >On 21/06/2016 16:29, Hendrik Leppkes wrote:
> >>On Tue, Jun 21, 2016 at 4:20 PM, Benoit Fouet
> >> wrote:
> >>>Hi,
> >>>
> >>>
> >>>On 21/06/2016 14:52, Hendrik Leppkes wrote:
> On Tue, Jun 21, 2016 at 2:40 PM, Clément Bœsch  wrote:
> >On Tue, Jun 21, 2016 at 02:34:33PM +0200, Benoit Fouet wrote:
> >>Hi,
> >>
> >>Unless I totally missed something, the FIXME in
> >>H264ParamSets structure
> >>should be fixed by attached patch.
> >>
> >>-- 
> >>Ben
> >>
> >>  From 28ae10498f81070539bdb8f40236326743350101 Mon Sep
> >>17 00:00:00 2001
> >>From: Benoit Fouet 
> >>Date: Tue, 21 Jun 2016 14:17:13 +0200
> >>Subject: [PATCH] h264: make H264ParamSets sps const
> >>
> >>---
> >>   libavcodec/h264.h   | 3 +--
> >>   libavcodec/h264_slice.c | 2 +-
> >>   2 files changed, 2 insertions(+), 3 deletions(-)
> >>
> >>diff --git a/libavcodec/h264.h b/libavcodec/h264.h
> >>index c4d2921..b809ee5 100644
> >>--- a/libavcodec/h264.h
> >>+++ b/libavcodec/h264.h
> >>@@ -234,8 +234,7 @@ typedef struct H264ParamSets {
> >>   AVBufferRef *sps_ref;
> >>   /* currently active parameters sets */
> >>   const PPS *pps;
> >>-// FIXME this should properly be const
> >>-SPS *sps;
> >>+const SPS *sps;
> >>   } H264ParamSets;
> >>
> >>   /**
> >>diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c
> >>index 6e7b940..da7f9dd 100644
> >>--- a/libavcodec/h264_slice.c
> >>+++ b/libavcodec/h264_slice.c
> >>@@ -873,7 +873,7 @@ static enum AVPixelFormat
> >>get_pixel_format(H264Context *h, int force_callback)
> >>   /* export coded and cropped frame dimensions to AVCodecContext */
> >>   static int init_dimensions(H264Context *h)
> >>   {
> >>-SPS *sps = h->ps.sps;
> >>+SPS *sps = (SPS*)h->ps.sps_ref->data;
> >>   int width  = h->width  - (sps->crop_right + sps->crop_left);
> >>   int height = h->height - (sps->crop_top   +
> >>sps->crop_bottom);
> >>   av_assert0(sps->crop_right + sps->crop_left <
> >>(unsigned)h->width);
> >So it's not actually const, right?
> >
> Indeed, the FIXME wasn't just there because someone forgot to write
> "const" in front of it, but because it was used in some parts as
> not-const.
> >>>
> >>>OK, right... Thanks for reminding me of reading the code better before
> >>>sending a patch.
> >>>
> >>>As far as I can see, the only place where this constness is
> >>>not preserved is
> >>>in the init_dimensions function (in h264_slice), in a dead
> >>>part of the code,
> >>>as crop is asserted at the beginning of the very same function.
> >>>Please correct me if I've missed other places.
> >>>
> >>If anything the asserts should probably be removed, because bad files
> >>should never be able to trigger assertions, and the existing check
> >>remain.
> >
> >Well, the SPS "decoder" already takes care of the check (see
> >ff_h264_decode_seq_parameter_set).
> >So I could remove the check, because it seems useless, instead of
> >removing it because "bad things happen", what do you think?
> >
> 
> Any objection to this patch now?

I am OK with the patch; maybe give others a bit of time to reply
before applying, though.
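
A minimal sketch of the const arrangement discussed in the quoted thread, assuming heavily simplified stand-in types. The struct layouts and helper names below are hypothetical, not the real libavcodec definitions. Readers go through the const pointer, while the one place that has to modify the data reaches the storage through the owning buffer reference, as the quoted init_dimensions hunk does.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real structures. */
typedef struct SPS {
    unsigned crop_left;
    unsigned crop_right;
} SPS;

typedef struct BufRef {
    uint8_t *data; /* owning, writable storage */
} BufRef;

typedef struct ParamSets {
    BufRef    *sps_ref; /* reference to the buffer that owns the SPS */
    const SPS *sps;     /* read-only view of the active SPS */
} ParamSets;

/* Read-only users only ever need the const view. */
static int cropped_width(const ParamSets *ps, int width)
{
    const SPS *sps = ps->sps;
    return width - (int)(sps->crop_left + sps->crop_right);
}

/* A writer goes through the owning buffer instead of casting away const. */
static void reset_crop(ParamSets *ps)
{
    SPS *sps = (SPS *)ps->sps_ref->data;
    sps->crop_left = sps->crop_right = 0;
}
```

The design point is that the const qualifier documents the common case, and any remaining write access is confined to a single, easy-to-audit spot.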

[...]

thx


-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.




Re: [FFmpeg-devel] [PATCH 01/10] diracdsp: add SIMD for the 10 bit version of put_signed_rect_clamped

2016-06-23 Thread James Almer
On 6/23/2016 2:06 PM, Rostislav Pehlivanov wrote:
> Signed-off-by: Rostislav Pehlivanov 
> ---
>  libavcodec/x86/diracdsp.asm| 47 
> ++
>  libavcodec/x86/diracdsp_init.c |  6 ++
>  2 files changed, 53 insertions(+)
> 
> diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> index a042413..9db7b67 100644
> --- a/libavcodec/x86/diracdsp.asm
> +++ b/libavcodec/x86/diracdsp.asm
> @@ -22,6 +22,8 @@
>  
>  SECTION_RODATA
>  pw_7: times 8 dw 7
> +convert_to_unsigned_10bit: times 4 dd 0x200
> +clip_10bit:times 8 dw 0x3ff
>  
>  cextern pw_3
>  cextern pw_16
> @@ -172,6 +174,48 @@ cglobal put_signed_rect_clamped_%1, 5,9,3, dst, 
> dst_stride, src, src_stride, w,
>  RET
>  %endm
>  
> +%macro PUT_RECT_10 0
> +; void put_signed_rect_clamped_10(uint8_t *dst, int dst_stride, const 
> uint8_t *src, int src_stride, int width, int height)
> +cglobal put_signed_rect_clamped_10, 6, 9, 6, dst, dst_stride, src, 
> src_stride, w, h

This is x86_64 only. Either add the relevant pre-processor checks here
and to the init file, or make the necessary changes to make it work
on x86_32.
Look at the 8bit version of put_signed_rect_clamped for an example of
how to deal with this using the stack.

> +
> +neg  wq
> +neg  hq

Why? You're not using these as part of effective addresses, just as
counters. Keep them as is and just do sub instead of add in the loops
below.
For that matter, you'd need to sign extend these with movsxd before
negating them, or change the prototype and make them ptrdiff_t instead
of int.
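
A minimal C model of the sign-extension concern, assuming the usual x86_64 calling conventions in which the upper 32 bits of a register carrying an int argument are not guaranteed to hold anything meaningful. The function names are hypothetical; they only imitate what a 64-bit neg and movsxd do to a register value.

```c
#include <stdint.h>

/* Model of a 64-bit "neg": negate the whole register, garbage included. */
static uint64_t neg64(uint64_t reg)
{
    return 0 - reg;
}

/* Model of "movsxd": sign-extend the low 32 bits to the full 64 bits. */
static uint64_t sign_extend32(uint64_t reg)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}
```

For a register holding width 8 in its low half and leftover bits in its high half, neg64() alone does not produce -8, while neg64(sign_extend32(reg)) does. Declaring the parameters as ptrdiff_t avoids the issue because the caller then passes a full 64-bit value.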

> +mov  r6, srcq
> +mov  r7, dstq
> +mov  r8, wq
> +pxor m2, m2
> +mova m3, [clip_10bit]
> +mova m4, [convert_to_unsigned_10bit]
> +
> +.loop_h:
> +mov  srcq, r6
> +mov  dstq, r7
> +mov  wq,   r8
> +
> +.loop_w:
> +movu m0, [srcq+0*mmsize]
> +movu m1, [srcq+1*mmsize]
> +
> +padddm0, m4
> +padddm1, m4
> +packusdw m0, m0, m1
> +CLIPWm0, m2, m3 ; packusdw saturates so it's fine

Would be nice if you could make this work with SSE2 as well.
There are some examples of packusdw SSE2 emulation in the codebase.
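
A scalar sketch of what the quoted inner loop computes, inferred only from the constants in the patch (0x200 bias, 0x3ff clip) and from the documented behaviour of packusdw, which saturates signed 32-bit values to unsigned 16 bits. The function names are hypothetical, strides are assumed to be in bytes, and inputs are assumed small enough that adding the bias cannot overflow.

```c
#include <stdint.h>

/* Scalar model of one packusdw lane: signed 32-bit to unsigned 16-bit, saturating. */
static uint16_t pack_us_dw(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 0xffff)
        return 0xffff;
    return (uint16_t)v;
}

/* Hypothetical scalar counterpart of the SIMD loop; not the FFmpeg C version. */
static void put_signed_rect_clamped_10_ref(uint8_t *dst, int dst_stride,
                                           const uint8_t *src, int src_stride,
                                           int width, int height)
{
    for (int y = 0; y < height; y++) {
        const int32_t *s = (const int32_t *)(src + y * src_stride);
        uint16_t *d      = (uint16_t *)(dst + y * dst_stride);
        for (int x = 0; x < width; x++) {
            uint16_t v = pack_us_dw(s[x] + 0x200); /* bias to unsigned, clamp below at 0 */
            d[x] = v > 0x3ff ? 0x3ff : v;          /* clamp above at the 10-bit maximum */
        }
    }
}
```

An SSE2 fallback has to reproduce the pack_us_dw() step by other means, since packusdw itself is an SSE4.1 instruction.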

> +
> +movu [dstq], m0
> +
> +add  srcq, 2*mmsize
> +add  dstq, 1*mmsize
> +add  wq, 8
> +jl   .loop_w
> +
> +add  r6, src_strideq
> +add  r7, dst_strideq
> +add  hq, 1

Make sure to do "sub wd, 8" and "sub hd, 1" after removing the above
negs if you don't change the prototype.

> +jl   .loop_h
> +
> +RET
> +%endm
> +
>  %macro ADD_RECT 1
>  ; void add_rect_clamped(uint8_t *dst, uint16_t *src, int stride, int16_t 
> *idwt, int idwt_stride, int width, int height)
>  cglobal add_rect_clamped_%1, 7,9,3, dst, src, stride, idwt, idwt_stride, w, h
> @@ -263,3 +307,6 @@ ADD_RECT sse2
>  HPEL_FILTER sse2
>  ADD_OBMC 32, sse2
>  ADD_OBMC 16, sse2
> +
> +INIT_XMM sse4
> +PUT_RECT_10

No need to make it a macro if it's going to be a single version.
If you add an SSE2 one then this would make sense.

> diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
> index 5fae798..4786eea 100644
> --- a/libavcodec/x86/diracdsp_init.c
> +++ b/libavcodec/x86/diracdsp_init.c
> @@ -46,6 +46,8 @@ void ff_put_rect_clamped_sse2(uint8_t *dst, int dst_stride, 
> const int16_t *src,
>  void ff_put_signed_rect_clamped_mmx(uint8_t *dst, int dst_stride, const 
> int16_t *src, int src_stride, int width, int height);
>  void ff_put_signed_rect_clamped_sse2(uint8_t *dst, int dst_stride, const 
> int16_t *src, int src_stride, int width, int height);
>  
> +void ff_put_signed_rect_clamped_10_sse4(uint8_t *dst, int dst_stride, const 
> uint8_t *src, int src_stride, int width, int height);
> +
>  #if HAVE_YASM
>  
>  #define HPEL_FILTER(MMSIZE, EXT) 
> \
> @@ -184,4 +186,8 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
>  c->put_dirac_pixels_tab[2][0] = ff_put_dirac_pixels32_sse2;
>  c->avg_dirac_pixels_tab[2][0] = ff_avg_dirac_pixels32_sse2;
>  }
> +
> +if (EXTERNAL_SSE4(mm_flags)) {
> +c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
> +}
>  }
> 



Re: [FFmpeg-devel] [PATCH] doc: add Libav merge document

2016-06-23 Thread Michael Niedermayer
On Thu, Jun 23, 2016 at 04:27:59PM +0200, Clément Bœsch wrote:
> From: Clément Bœsch 
> 
> ---
> Very incomplete, maybe splittable (split out the 3 first sections somewhere as
> an announce on the website)
> 

> Comments from other people who have done merges in the past very welcome,
> notably on the last 2 sections.

I'll probably post some patches/suggestions to extend this once it's
applied


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato




Re: [FFmpeg-devel] [PATCH 02/10] diracdsp: add dequantization SIMD

2016-06-23 Thread James Almer
On 6/23/2016 2:06 PM, Rostislav Pehlivanov wrote:
> Currently unused, to be used in the following commits.
> 
> Signed-off-by: Rostislav Pehlivanov 
> ---
>  libavcodec/diracdsp.c  | 24 
>  libavcodec/diracdsp.h  |  4 
>  libavcodec/x86/diracdsp.asm| 41 +
>  libavcodec/x86/diracdsp_init.c |  4 +++-
>  4 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/diracdsp.c b/libavcodec/diracdsp.c
> index ab8d149..d0cfd00 100644
> --- a/libavcodec/diracdsp.c
> +++ b/libavcodec/diracdsp.c
> @@ -189,6 +189,27 @@ static void add_rect_clamped_c(uint8_t *dst, const 
> uint16_t *src, int stride,
>  }
>  }
>  
> +#define DEQUANT_SUBBAND(PX)  
>   \
> +static void dequant_subband_ ## PX ## _c(uint8_t *src, uint8_t *dst, 
> ptrdiff_t stride, \
> + const int qf, const int qs, int64_t 
> tot_v, int64_t tot_h) \

Shouldn't this be int (or ptrdiff_t)? Seeing they are int in the
SliceCoeffs struct introduced by patch 6, I don't see why they
should be int64_t here. Unless I'm missing something.
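
For readability, a compact sketch of the per-coefficient arithmetic that the line-wrapped macro quoted below performs, written as a plain function. The name dequant_coeff is hypothetical, and plain int parameters are used here only as an illustration of the suggestion above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the macro body: scale the magnitude, then restore the sign. */
static int32_t dequant_coeff(int32_t c, int qf, int qs)
{
    int sign    = (c > 0) - (c < 0);         /* -1, 0 or +1, same as FFSIGN(c)*(!!c) */
    int32_t mag = (abs(c) * qf + qs) >> 2;   /* (|c| * qf + qs) >> 2 */
    return mag * sign;                       /* a zero input stays zero */
}
```

For example, with qf = 8 and qs = 4 an input of 5 becomes (40 + 4) >> 2 = 11, and an input of -5 becomes -11.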

> +{
>   \
> +int i, y;
>   \
> +for (y = 0; y < tot_v; y++) {
>   \
> +PX c, sign, *src_r = (PX *)src, *dst_r = (PX *)dst;  
>   \
> +for (i = 0; i < tot_h; i++) {
>   \
> +c = *src_r++;
>   \
> +sign = FFSIGN(c)*(!!c);  
>   \
> +c = (FFABS(c)*qf + qs) >> 2; 
>   \
> +*dst_r++ = c*sign;   
>   \
> +}
>   \
> +src += tot_h << (sizeof(PX) >> 1);   
>   \
> +dst += stride;   
>   \
> +}
>   \
> +}
> +
> +DEQUANT_SUBBAND(int16_t)
> +DEQUANT_SUBBAND(int32_t)
> +
>  #define PIXFUNC(PFX, WIDTH) \
>  c->PFX ## _dirac_pixels_tab[WIDTH>>4][0] = ff_ ## PFX ## _dirac_pixels 
> ## WIDTH ## _c; \
>  c->PFX ## _dirac_pixels_tab[WIDTH>>4][1] = ff_ ## PFX ## _dirac_pixels 
> ## WIDTH ## _l2_c; \
> @@ -214,6 +235,9 @@ av_cold void ff_diracdsp_init(DiracDSPContext *c)
>  c->biweight_dirac_pixels_tab[1] = biweight_dirac_pixels16_c;
>  c->biweight_dirac_pixels_tab[2] = biweight_dirac_pixels32_c;
>  
> +c->dequant_subband[0] = c->dequant_subband[2] = 
> dequant_subband_int16_t_c;
> +c->dequant_subband[1] = c->dequant_subband[3] = 
> dequant_subband_int32_t_c;
> +
>  PIXFUNC(put, 8);
>  PIXFUNC(put, 16);
>  PIXFUNC(put, 32);
> diff --git a/libavcodec/diracdsp.h b/libavcodec/diracdsp.h
> index 25a872d..c0ac56b 100644
> --- a/libavcodec/diracdsp.h
> +++ b/libavcodec/diracdsp.h
> @@ -22,6 +22,7 @@
>  #define AVCODEC_DIRACDSP_H
>  
>  #include 
> +#include 
>  
>  typedef void (*dirac_weight_func)(uint8_t *block, int stride, int 
> log2_denom, int weight, int h);
>  typedef void (*dirac_biweight_func)(uint8_t *dst, const uint8_t *src, int 
> stride, int log2_denom, int weightd, int weights, int h);
> @@ -46,6 +47,9 @@ typedef struct {
>  void (*add_rect_clamped)(uint8_t *dst/*align 16*/, const uint16_t 
> *src/*align 16*/, int stride, const int16_t *idwt/*align 16*/, int 
> idwt_stride, int width, int height/*mod 2*/);
>  void (*add_dirac_obmc[3])(uint16_t *dst, const uint8_t *src, int stride, 
> const uint8_t *obmc_weight, int yblen);
>  
> +/* 0-1: int16_t and int32_t asm/c, 2-3: int16 and int32_t, C only */
> +void (*dequant_subband[4])(uint8_t *src, uint8_t *dst, ptrdiff_t stride, 
> const int qf, const int qs, int64_t tot_v, int64_t tot_h);
> +
>  dirac_weight_func weight_dirac_pixels_tab[3];
>  dirac_biweight_func biweight_dirac_pixels_tab[3];
>  } DiracDSPContext;
> diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> index 9db7b67..f743363 100644
> --- a/libavcodec/x86/diracdsp.asm
> +++ b/libavcodec/x86/diracdsp.asm
> @@ -289,6 +289,46 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, 
> obmc, yblen
>  RET
>  %endm
>  
> +%macro DEQUANT_SUBBAND_32 0
> +; void dequant_subband_32(uint8_t *src, 

[FFmpeg-devel] [PATCH 10/10] diracdec: do not memset the entire coefficient buffer for HQ pictures

2016-06-23 Thread Rostislav Pehlivanov
This is now handled by the slice decoding function.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index ec45132..a9af5ff 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -1891,9 +1891,11 @@ static int dirac_decode_frame_internal(DiracContext *s)
 
 if (s->low_delay) {
 /* [DIRAC_STD] 13.5.1 low_delay_transform_data() */
-for (comp = 0; comp < 3; comp++) {
-Plane *p = &s->plane[comp];
-memset(p->idwt.buf, 0, p->idwt.stride * p->idwt.height);
+if (!s->hq_picture) {
+for (comp = 0; comp < 3; comp++) {
+Plane *p = &s->plane[comp];
+memset(p->idwt.buf, 0, p->idwt.stride * p->idwt.height);
+}
 }
 if (!s->zero_res) {
 if ((ret = decode_lowdelay(s)) < 0)
-- 
2.8.1.369.geae769a



[FFmpeg-devel] [PATCH 09/10] diracdec: run the final decoding stage/idwt for every plane in parallel

2016-06-23 Thread Rostislav Pehlivanov
27% performance increase for a 12bit 4k file.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 152 ++
 1 file changed, 80 insertions(+), 72 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index 63eb4d1..ec45132 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -1804,99 +1804,107 @@ static int interpolate_refplane(DiracContext *s, 
DiracFrame *ref, int plane, int
 return 0;
 }
 
-/**
- * Dirac Specification ->
- * 13.0 Transform data syntax. transform_data()
- */
-static int dirac_decode_frame_internal(DiracContext *s)
+static int decode_plane(AVCodecContext *avctx, void *arg, int jobnr, int 
thread)
 {
 DWTContext d;
-int y, i, comp, dsty;
-int ret;
+int i, y, ret, dsty;
+DiracContext *s = avctx->priv_data;
+Plane *p= &s->plane[jobnr];
+uint8_t *frame  = s->current_picture->avframe->data[jobnr];
 
-if (s->low_delay) {
-/* [DIRAC_STD] 13.5.1 low_delay_transform_data() */
-for (comp = 0; comp < 3; comp++) {
-Plane *p = &s->plane[comp];
-memset(p->idwt.buf, 0, p->idwt.stride * p->idwt.height);
-}
-if (!s->zero_res) {
-if ((ret = decode_lowdelay(s)) < 0)
-return ret;
-}
+/* FIXME: small resolutions */
+for (i = 0; i < 4; i++)
+s->edge_emu_buffer[i] = s->edge_emu_buffer_base + i*FFALIGN(p->width, 
16);
+
+if (!s->zero_res && !s->low_delay)
+{
+memset(p->idwt.buf, 0, p->idwt.stride * p->idwt.height);
+decode_component(s, jobnr); /* [DIRAC_STD] 13.4.1 
core_transform_data() */
 }
+ret = ff_spatial_idwt_init(&d, &p->idwt, s->wavelet_idx+2,
+   s->wavelet_depth, s->bit_depth);
+if (ret < 0)
+return ret;
 
-for (comp = 0; comp < 3; comp++) {
-Plane *p   = &s->plane[comp];
-uint8_t *frame = s->current_picture->avframe->data[comp];
+if (!s->num_refs) { /* intra */
+for (y = 0; y < p->height; y += 16) {
+int idx = (s->bit_depth - 8) >> 1;
+ff_spatial_idwt_slice2(&d, y+16); /* decode */
+s->diracdsp.put_signed_rect_clamped[idx](frame + y*p->stride,
+ p->stride,
+ p->idwt.buf + 
y*p->idwt.stride,
+ p->idwt.stride, p->width, 
16);
+}
+} else { /* inter */
+int rowheight = p->ybsep*p->stride;
 
-/* FIXME: small resolutions */
-for (i = 0; i < 4; i++)
-s->edge_emu_buffer[i] = s->edge_emu_buffer_base + 
i*FFALIGN(p->width, 16);
+select_dsp_funcs(s, p->width, p->height, p->xblen, p->yblen);
 
-if (!s->zero_res && !s->low_delay)
-{
-memset(p->idwt.buf, 0, p->idwt.stride * p->idwt.height);
-decode_component(s, comp); /* [DIRAC_STD] 13.4.1 
core_transform_data() */
+for (i = 0; i < s->num_refs; i++) {
+int ret = interpolate_refplane(s, s->ref_pics[i], jobnr, p->width, 
p->height);
+if (ret < 0)
+return ret;
 }
-ret = ff_spatial_idwt_init(&d, &p->idwt, s->wavelet_idx+2,
-   s->wavelet_depth, s->bit_depth);
-if (ret < 0)
-return ret;
 
-if (!s->num_refs) { /* intra */
-for (y = 0; y < p->height; y += 16) {
-int idx = (s->bit_depth - 8) >> 1;
-ff_spatial_idwt_slice2(&d, y+16); /* decode */
-s->diracdsp.put_signed_rect_clamped[idx](frame + y*p->stride,
- p->stride,
- p->idwt.buf + 
y*p->idwt.stride,
- p->idwt.stride, 
p->width, 16);
-}
-} else { /* inter */
-int rowheight = p->ybsep*p->stride;
+memset(s->mctmp, 0, 4*p->yoffset*p->stride);
 
-select_dsp_funcs(s, p->width, p->height, p->xblen, p->yblen);
+dsty = -p->yoffset;
+for (y = 0; y < s->blheight; y++) {
+int h = 0,
+start = FFMAX(dsty, 0);
+uint16_t *mctmp= s->mctmp + y*rowheight;
+DiracBlock *blocks = s->blmotion + y*s->blwidth;
 
-for (i = 0; i < s->num_refs; i++) {
-int ret = interpolate_refplane(s, s->ref_pics[i], comp, 
p->width, p->height);
-if (ret < 0)
-return ret;
-}
+init_obmc_weights(s, p, y);
 
-memset(s->mctmp, 0, 4*p->yoffset*p->stride);
+if (y == s->blheight-1 || start+p->ybsep > p->height)
+h = p->height - start;
+else
+h = p->ybsep - (start - dsty);
+if (h < 0)
+break;

[FFmpeg-devel] [PATCH 08/10] diracdec: do not allocate and free slice parameters every frame

2016-06-23 Thread Rostislav Pehlivanov
Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 36 ++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index 9256777..63eb4d1 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -121,6 +121,14 @@ typedef struct Plane {
 SubBand band[MAX_DWT_LEVELS][4];
 } Plane;
 
+/* Used by Low Delay and High Quality profiles */
+typedef struct DiracSlice {
+GetBitContext gb;
+int slice_x;
+int slice_y;
+int bytes;
+} DiracSlice;
+
 typedef struct DiracContext {
 AVCodecContext *avctx;
 MpegvideoEncDSPContext mpvencdsp;
@@ -167,6 +175,9 @@ typedef struct DiracContext {
 int threads_num_buf; /* Current # of buffers allocated
*/
 int thread_buf_size; /* Each thread has a buffer this size
*/
 
+DiracSlice *slice_params_buf;
+int slice_params_num_buf;
+
 struct {
 unsigned width;
 unsigned height;
@@ -417,6 +428,7 @@ static av_cold int dirac_decode_end(AVCodecContext *avctx)
 av_frame_free(&s->all_frames[i].avframe);
 
 av_freep(&s->thread_buf);
+av_freep(&s->slice_params_buf);
 
 return 0;
 }
@@ -724,15 +736,6 @@ static void decode_subband(DiracContext *s, GetBitContext 
*gb, int quant,
 }
 }
 
-/* Used by Low Delay and High Quality profiles */
-typedef struct DiracSlice {
-GetBitContext gb;
-int slice_x;
-int slice_y;
-int bytes;
-} DiracSlice;
-
-
 /**
  * Dirac Specification ->
  * 13.5.2 Slices. slice(sx,sy)
@@ -904,9 +907,15 @@ static int decode_lowdelay(DiracContext *s)
 SliceCoeffs tmp[MAX_DWT_LEVELS];
 int slice_num = 0;
 
-slices = av_mallocz_array(s->num_x, s->num_y * sizeof(DiracSlice));
-if (!slices)
-return AVERROR(ENOMEM);
+if (s->slice_params_num_buf != (s->num_x * s->num_y)) {
+s->slice_params_buf = av_realloc_f(s->slice_params_buf, s->num_x * s->num_y, 
sizeof(DiracSlice));
+if (!s->slice_params_buf) {
+av_log(s->avctx, AV_LOG_ERROR, "slice params buffer allocation 
failure\n");
+return AVERROR(ENOMEM);
+}
+s->slice_params_num_buf = s->num_x * s->num_y;
+}
+slices = s->slice_params_buf;
 
 /* 8 becacuse that's how much the golomb reader could overread junk data
  * from another plane/slice at most, and 512 because SIMD */
@@ -941,7 +950,6 @@ static int decode_lowdelay(DiracContext *s)
 }
 if (bytes >= INT_MAX || bytes*8 > bufsize) {
 av_log(s->avctx, AV_LOG_ERROR, "too many bytes\n");
-av_free(slices);
 return AVERROR_INVALIDDATA;
 }
 
@@ -998,7 +1006,7 @@ static int decode_lowdelay(DiracContext *s)
 intra_dc_prediction_8(&s->plane[2].band[0][0]);
 }
 }
-av_free(slices);
+
 return 0;
 }
 
-- 
2.8.1.369.geae769a



[FFmpeg-devel] [PATCH 05/10] diractab: expose the maximum quantization index as a macro

2016-06-23 Thread Rostislav Pehlivanov
Prevents having to have random magic values in the decoder and a
separate macro in the encoder.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 8 
 libavcodec/diractab.h | 2 ++
 libavcodec/vc2enc.c   | 9 +++--
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index c8ab2df..48ba194 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -486,7 +486,7 @@ static inline void codeblock(DiracContext *s, SubBand *b,
 b->quant = quant;
 }
 
-if (b->quant > 115) {
+if (b->quant > DIRAC_MAX_QUANT_INDEX) {
 av_log(s->avctx, AV_LOG_ERROR, "Unsupported quant %d\n", b->quant);
 b->quant = 0;
 return;
@@ -676,12 +676,12 @@ static void decode_subband(DiracContext *s, GetBitContext 
*gb, int quant,
 uint8_t *buf2 = b2 ? b2->ibuf + top * b2->stride: NULL;
 int x, y;
 
-if (quant > 115) {
+if (quant > DIRAC_MAX_QUANT_INDEX) {
 av_log(s->avctx, AV_LOG_ERROR, "Unsupported quant %d\n", quant);
 return;
 }
-qfactor = ff_dirac_qscale_tab[quant & 0x7f];
-qoffset = ff_dirac_qoffset_intra_tab[quant & 0x7f] + 2;
+qfactor = ff_dirac_qscale_tab[quant & DIRAC_MAX_QUANT_INDEX];
+qoffset = ff_dirac_qoffset_intra_tab[quant & DIRAC_MAX_QUANT_INDEX] + 2;
 /* we have to constantly check for overread since the spec explicitly
 requires this, with the meaning that all remaining coeffs are set to 0 */
 if (get_bits_count(gb) >= bits_end)
diff --git a/libavcodec/diractab.h b/libavcodec/diractab.h
index cd8b8ac..2423b07 100644
--- a/libavcodec/diractab.h
+++ b/libavcodec/diractab.h
@@ -38,4 +38,6 @@ extern const int32_t ff_dirac_qoffset_intra_tab[120];
 /* Scaling offsets needed for quantization/dequantization, for inter frames */
 extern const int ff_dirac_qoffset_inter_tab[122];
 
+#define DIRAC_MAX_QUANT_INDEX (FF_ARRAY_ELEMS(ff_dirac_qscale_tab))
+
 #endif /* AVCODEC_DIRACTAB_H */
diff --git a/libavcodec/vc2enc.c b/libavcodec/vc2enc.c
index bbbeaa0..eda3901 100644
--- a/libavcodec/vc2enc.c
+++ b/libavcodec/vc2enc.c
@@ -29,11 +29,8 @@
 #include "vc2enc_dwt.h"
 #include "diractab.h"
 
-/* Quantizations above this usually zero coefficients and lower the quality */
-#define MAX_QUANT_INDEX FF_ARRAY_ELEMS(ff_dirac_qscale_tab)
-
 /* Total range is -COEF_LUT_TAB to +COEFF_LUT_TAB, but total tab size is half
- * (COEF_LUT_TAB*MAX_QUANT_INDEX) since the sign is appended during encoding */
+ * (COEF_LUT_TAB*DIRAC_MAX_QUANT_INDEX), as the sign is appended during encoding */
 #define COEF_LUT_TAB 2048
 
 /* The limited size resolution of each slice forces us to do this */
@@ -109,7 +106,7 @@ typedef struct Plane {
 
 typedef struct SliceArgs {
 PutBitContext pb;
-int cache[MAX_QUANT_INDEX];
+int cache[DIRAC_MAX_QUANT_INDEX];
 void *ctx;
 int x;
 int y;
@@ -1074,7 +1071,7 @@ static av_cold int vc2_encode_init(AVCodecContext *avctx)
 s->picture_number = 0;
 
 /* Total allowed quantization range */
-s->q_ceil= MAX_QUANT_INDEX;
+s->q_ceil= DIRAC_MAX_QUANT_INDEX;
 
 s->ver.major = 2;
 s->ver.minor = 0;
-- 
2.8.1.369.geae769a
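
The `quant & DIRAC_MAX_QUANT_INDEX` lines in this patch replace a `0x7f` bit mask with a macro that is defined as an element count. The two are not interchangeable in general: a mask only bounds an index when it has the form 2^n - 1. The following is a minimal sketch with a hypothetical 6-entry table (`demo_tab` and every `demo_`/`DEMO_` name are made up for illustration and are not FFmpeg identifiers). It shows that masking with a count does not keep an index in range, while comparing against the last valid index does. Whether this matters for the real tables depends on their sizes, which the sketch does not assume.

```c
#include <stddef.h>

#define DEMO_ARRAY_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical 6-entry table standing in for a quantizer table. */
static const int demo_tab[6] = { 4, 5, 6, 7, 8, 10 };

#define DEMO_ELEMS     ((int)DEMO_ARRAY_ELEMS(demo_tab)) /* element count: 6 */
#define DEMO_MAX_INDEX (DEMO_ELEMS - 1)                  /* last valid index: 5 */

/* Returns 1 if idx can safely be used to index demo_tab, 0 otherwise. */
static int demo_index_valid(int idx)
{
    return idx >= 0 && idx <= DEMO_MAX_INDEX;
}

/* Masking with the element count is NOT a range clamp: the result only
 * keeps the bits that are also set in the count. */
static int demo_mask_with_count(int idx)
{
    return idx & DEMO_ELEMS;
}
```

With a 6-entry table, `demo_mask_with_count(5)` silently turns a valid index into 4, and `demo_mask_with_count(7)` yields 6, which is one past the end.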



[FFmpeg-devel] [PATCH 03/10] diracdec: simplify golomb parsing and dequantization

2016-06-23 Thread Rostislav Pehlivanov
In preparation for the following commits, this commit simplifies the
coefficient parsing and dequantization function. It was needlessly
inlined without much performance gain.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 53 ++-
 1 file changed, 6 insertions(+), 47 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index 1d7bb9b..b2008c5 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -406,58 +406,17 @@ static av_cold int dirac_decode_end(AVCodecContext *avctx)
 return 0;
 }
 
-#define SIGN_CTX(x) (CTX_SIGN_ZERO + ((x) > 0) - ((x) < 0))
-
 static inline int coeff_unpack_golomb(GetBitContext *gb, int qfactor, int qoffset)
 {
-int sign, coeff;
-uint32_t buf;
-
-OPEN_READER(re, gb);
-UPDATE_CACHE(re, gb);
-buf = GET_CACHE(re, gb);
-
-if (buf & 0x8000) {
-LAST_SKIP_BITS(re,gb,1);
-CLOSE_READER(re, gb);
-return 0;
-}
-
-if (buf & 0xAA80) {
-buf >>= 32 - 8;
-SKIP_BITS(re, gb, ff_interleaved_golomb_vlc_len[buf]);
-
-coeff = ff_interleaved_ue_golomb_vlc_code[buf];
-} else {
-unsigned ret = 1;
-
-do {
-buf >>= 32 - 8;
-SKIP_BITS(re, gb,
-   FFMIN(ff_interleaved_golomb_vlc_len[buf], 8));
-
-if (ff_interleaved_golomb_vlc_len[buf] != 9) {
-ret <<= (ff_interleaved_golomb_vlc_len[buf] - 1) >> 1;
-ret  |= ff_interleaved_dirac_golomb_vlc_code[buf];
-break;
-}
-ret = (ret << 4) | ff_interleaved_dirac_golomb_vlc_code[buf];
-UPDATE_CACHE(re, gb);
-buf = GET_CACHE(re, gb);
-} while (ret<0x800U && BITS_AVAILABLE(re, gb));
-
-coeff = ret - 1;
-}
-
-coeff = (coeff * qfactor + qoffset) >> 2;
-sign  = SHOW_SBITS(re, gb, 1);
-LAST_SKIP_BITS(re, gb, 1);
-coeff = (coeff ^ sign) - sign;
-
-CLOSE_READER(re, gb);
+int coeff = dirac_get_se_golomb(gb);
+const int sign = FFSIGN(coeff);
+if (coeff)
+coeff = sign*((sign * coeff * qfactor + qoffset) >> 2);
 return coeff;
 }
 
+#define SIGN_CTX(x) (CTX_SIGN_ZERO + ((x) > 0) - ((x) < 0))
+
 #define UNPACK_ARITH(n, type) \
 static inline void coeff_unpack_arith_##n(DiracArith *c, int qfactor, int qoffset, \
   SubBand *b, type *buf, int x, int y) \
-- 
2.8.1.369.geae769a
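
The simplified dequantization in this patch works on the magnitude and restores the sign afterwards. A minimal standalone sketch of that arithmetic follows; `sign_of` and `dequant_coeff` are hypothetical names, and the sketch assumes non-negative `qfactor` and `qoffset`. It folds the patch's `FFSIGN` plus `if (coeff)` pair into a three-valued sign so that zero stays zero.

```c
#include <assert.h>

/* Sign of x as -1, 0 or +1. */
static int sign_of(int x)
{
    return (x > 0) - (x < 0);
}

/* Scale |coeff| by qfactor, add the rounding offset, drop the two
 * fractional bits, then put the sign back. A zero input stays zero. */
static int dequant_coeff(int coeff, int qfactor, int qoffset)
{
    const int sign = sign_of(coeff);
    assert(qfactor >= 0 && qoffset >= 0); /* assumption of this sketch */
    return sign * ((sign * coeff * qfactor + qoffset) >> 2);
}
```

Working on the magnitude keeps the right shift away from negative values, so positive and negative inputs of equal size dequantize symmetrically.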



[FFmpeg-devel] [PATCH 07/10] diracdec: implement a LUT-based Golomb code parser

2016-06-23 Thread Rostislav Pehlivanov
Still much left to optimize, but it provides a significant performance
improvement - 10% for 300Mbps (1080p30), 25% for 1.5Gbps (4k 60fps) in
comparison with the default implementation.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/Makefile|   3 +-
 libavcodec/dirac_vlc.c | 242 +
 libavcodec/dirac_vlc.h |  51 +++
 libavcodec/diracdec.c  |  25 ++---
 4 files changed, 308 insertions(+), 13 deletions(-)
 create mode 100644 libavcodec/dirac_vlc.c
 create mode 100644 libavcodec/dirac_vlc.h

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 7c3aa69..833dc35 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -233,7 +233,8 @@ OBJS-$(CONFIG_DCA_DECODER) += dcadec.o dca.o dcadata.o dcahuff.o \
 OBJS-$(CONFIG_DCA_ENCODER) += dcaenc.o dca.o dcadata.o
 OBJS-$(CONFIG_DDS_DECODER) += dds.o
 OBJS-$(CONFIG_DIRAC_DECODER)   += diracdec.o dirac.o diracdsp.o diractab.o \
-  dirac_arith.o mpeg12data.o dirac_dwt.o
+  dirac_arith.o mpeg12data.o dirac_dwt.o \
+  dirac_vlc.o
 OBJS-$(CONFIG_DFA_DECODER) += dfa.o
 OBJS-$(CONFIG_DNXHD_DECODER)   += dnxhddec.o dnxhddata.o
 OBJS-$(CONFIG_DNXHD_ENCODER)   += dnxhdenc.o dnxhddata.o
diff --git a/libavcodec/dirac_vlc.c b/libavcodec/dirac_vlc.c
new file mode 100644
index 000..4de22a0
--- /dev/null
+++ b/libavcodec/dirac_vlc.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (C) 2016 Open Broadcast Systems Ltd.
+ * Author2016 Rostislav Pehlivanov 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "dirac_vlc.h"
+
+#define LUT_SIZE   (1 << LUT_BITS)
+#define RSIZE_BITS (CHAR_BIT*sizeof(residual))
+
+#define CONVERT_TO_RESIDUE(a, b) \
+(((residual)(a)) << (RSIZE_BITS - (b)))
+
+#define INIT_RESIDUE(N, I, B) \
+residual N = B ? CONVERT_TO_RESIDUE(I, B) : 0; \
+av_unused int32_t N ## _bits  = B
+
+int ff_dirac_golomb_read_32bit(DiracGolombLUT *lut_ctx, const uint8_t *buf,
+   int bytes, uint8_t *_dst, int coeffs)
+{
+int i, b, c_idx = 0;
+int32_t *dst = (int32_t *)_dst;
+DiracGolombLUT *future[4], *l = &lut_ctx[2*LUT_SIZE + buf[0]];
+INIT_RESIDUE(res, 0, 0);
+
+#define APPEND_RESIDUE(N, M) \
+N  |= M >> (N ## _bits); \
+N ## _bits +=  (M ## _bits)
+
+for (b = 1; b <= bytes; b++) {
+future[0] = &lut_ctx[buf[b]];
+future[1] = future[0] + 1*LUT_SIZE;
+future[2] = future[0] + 2*LUT_SIZE;
+future[3] = future[0] + 3*LUT_SIZE;
+
+if ((c_idx + 1) > coeffs)
+return c_idx;
+
+/* res_bits is a hint for better branch prediction */
+if (res_bits && l->sign) {
+int32_t coeff = 1;
+APPEND_RESIDUE(res, l->preamble);
+for (i = 0; i < (res_bits >> 1) - 1; i++) {
+coeff <<= 1;
+coeff |= (res >> (RSIZE_BITS - 2*i - 2)) & 1;
+}
+dst[c_idx++] = l->sign * (coeff - 1);
+res_bits = res = 0;
+}
+
+memcpy(&dst[c_idx], l->ready, LUT_BITS*sizeof(int32_t));
+c_idx += l->ready_num;
+
+APPEND_RESIDUE(res, l->leftover);
+
+l = future[l->need_s ? 3 : !res_bits ? 2 : res_bits & 1];
+}
+
+return c_idx;
+}
+
+int ff_dirac_golomb_read_16bit(DiracGolombLUT *lut_ctx, const uint8_t *buf,
+   int bytes, uint8_t *_dst, int coeffs)
+{
+int i, b, c_idx = 0;
+int16_t *dst = (int16_t *)_dst;
+DiracGolombLUT *future[4], *l = &lut_ctx[2*LUT_SIZE + buf[0]];
+INIT_RESIDUE(res, 0, 0);
+
+#define APPEND_RESIDUE(N, M) \
+N  |= M >> (N ## _bits); \
+N ## _bits +=  (M ## _bits)
+
+for (b = 1; b <= bytes; b++) {
+future[0] = &lut_ctx[buf[b]];
+future[1] = future[0] + 1*LUT_SIZE;
+future[2] = future[0] + 2*LUT_SIZE;
+future[3] = future[0] + 3*LUT_SIZE;
+
+if ((c_idx + 1) > coeffs)
+ 
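
For reference when reading the table-driven reader in this patch: the baseline it speeds up is a bit-at-a-time decoder for the interleaved exp-Golomb code. The following is a minimal sketch of such a decoder as the code is commonly described for Dirac/VC-2. The `BitReader` type and all function names are hypothetical and are not FFmpeg's API; returning 1 past the end of the buffer and capping the magnitude are choices made for this sketch only.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over a byte buffer (hypothetical helper). */
typedef struct BitReader {
    const uint8_t *buf;
    size_t size_bits; /* number of valid bits in buf */
    size_t pos;       /* index of the next bit to read */
} BitReader;

/* Reads one bit. Past the end it returns 1, which terminates any code. */
static int br_read_bit(BitReader *br)
{
    int bit;
    if (br->pos >= br->size_bits)
        return 1;
    bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Unsigned interleaved exp-Golomb: every 0 "follow" bit is followed by one
 * data bit, and a 1 bit ends the code. The magnitude cap is this sketch's
 * own guard against overflow on corrupt input. */
static uint32_t golomb_read_uint(BitReader *br)
{
    uint32_t value = 1;
    while (value < 0x40000000u && !br_read_bit(br)) {
        value <<= 1;
        value |= (uint32_t)br_read_bit(br);
    }
    return value - 1;
}

/* Signed variant: a nonzero magnitude is followed by a sign bit (1 = negative). */
static int32_t golomb_read_sint(BitReader *br)
{
    int32_t value = (int32_t)golomb_read_uint(br);
    if (value && br_read_bit(br))
        value = -value;
    return value;
}
```

A LUT-based parser has to produce the same values while consuming a whole byte per lookup, which is why it needs to carry leftover bits (the residue) from one byte to the next.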

[FFmpeg-devel] [PATCH 04/10] diracdec: decode HQ profile slices in rows

2016-06-23 Thread Rostislav Pehlivanov
Significantly improves performance.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index b2008c5..c8ab2df 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -806,6 +806,16 @@ static int decode_hq_slice(AVCodecContext *avctx, void *arg)
 return 0;
 }
 
+static int decode_hq_slice_row(AVCodecContext *avctx, void *arg, int jobnr, int threadnr)
+{
+int i;
+DiracContext *s = avctx->priv_data;
+DiracSlice *slices = ((DiracSlice *)arg) + s->num_x*jobnr;
+for (i = 0; i < s->num_x; i++)
+decode_hq_slice(avctx, &slices[i]);
+return 0;
+}
+
 /**
  * Dirac Specification ->
  * 13.5.1 low_delay_transform_data()
@@ -857,8 +867,7 @@ static int decode_lowdelay(DiracContext *s)
 bufsize = 0;
 }
 }
-avctx->execute(avctx, decode_hq_slice, slices, NULL, slice_num,
-   sizeof(DiracSlice));
+avctx->execute2(avctx, decode_hq_slice_row, slices, NULL, s->num_y);
 } else {
 for (slice_y = 0; bufsize > 0 && slice_y < s->num_y; slice_y++) {
 for (slice_x = 0; bufsize > 0 && slice_x < s->num_x; slice_x++) {
-- 
2.8.1.369.geae769a
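
The change in this patch goes from one job per slice to one job per slice row. That relies on the slices being stored row-major, so that row `jobnr` starts at offset `num_x*jobnr`. A minimal sketch of the indexing without any FFmpeg types (`DemoSlice` and the `demo_` functions are hypothetical, and the job runner here is a plain serial loop rather than a thread pool):

```c
#include <stddef.h>

typedef struct DemoSlice {
    int decoded; /* set to 1 once the slice has been processed */
} DemoSlice;

/* Stand-in for the per-slice decoder. */
static void demo_decode_slice(DemoSlice *sl)
{
    sl->decoded = 1;
}

/* One job per slice row: job `jobnr` handles the num_x consecutive
 * slices that make up that row. */
static int demo_decode_row(DemoSlice *all, int num_x, int jobnr)
{
    DemoSlice *row = all + (size_t)num_x * jobnr;
    int i;
    for (i = 0; i < num_x; i++)
        demo_decode_slice(&row[i]);
    return 0;
}

/* Serial stand-in for a job runner that issues num_y row jobs. */
static void demo_run_jobs(DemoSlice *all, int num_x, int num_y)
{
    int job;
    for (job = 0; job < num_y; job++)
        demo_decode_row(all, num_x, job);
}
```

Fewer, larger jobs mean less dispatch overhead per slice, at the cost of coarser load balancing across threads.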



[FFmpeg-devel] [PATCH 06/10] diracdec: rewrite HQ slice decoding

2016-06-23 Thread Rostislav Pehlivanov
Now coefficients are written to a buffer and are then dequantized by the
new SIMD dequantization functions. For the lower bands without enough
coefficients to fill a register (and hence they overwrite) the C version
of the dequantization function is used.

The buffer is per-thread and will be realloc'd if anything changes.
This prevents regressions and having to limit slice size.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdec.c | 126 --
 1 file changed, 111 insertions(+), 15 deletions(-)

diff --git a/libavcodec/diracdec.c b/libavcodec/diracdec.c
index 48ba194..14fa3eb 100644
--- a/libavcodec/diracdec.c
+++ b/libavcodec/diracdec.c
@@ -161,6 +161,10 @@ typedef struct DiracContext {
 unsigned num_x;  /* number of horizontal slices */
 unsigned num_y;  /* number of vertical slices */
 
+uint8_t *thread_buf; /* Per-thread buffer for coefficient storage */
+int threads_num_buf; /* Current # of buffers allocated */
+int thread_buf_size; /* Each thread has a buffer this size */
+
 struct {
 unsigned width;
 unsigned height;
@@ -370,6 +374,10 @@ static av_cold int dirac_decode_init(AVCodecContext *avctx)
 s->avctx = avctx;
 s->frame_number = -1;
 
+s->thread_buf = NULL;
+s->threads_num_buf = -1;
+s->thread_buf_size = -1;
+
 ff_diracdsp_init(&s->diracdsp);
 ff_mpegvideoencdsp_init(&s->mpvencdsp, avctx);
 ff_videodsp_init(&s->vdsp, 8);
@@ -403,6 +411,8 @@ static av_cold int dirac_decode_end(AVCodecContext *avctx)
 for (i = 0; i < MAX_FRAMES; i++)
 av_frame_free(&s->all_frames[i].avframe);
 
+av_freep(&s->thread_buf);
+
 return 0;
 }
 
@@ -760,46 +770,108 @@ static int decode_lowdelay_slice(AVCodecContext *avctx, void *arg)
 return 0;
 }
 
+typedef struct SliceCoeffs {
+int left;
+int top;
+int tot_h;
+int tot_v;
+int tot;
+} SliceCoeffs;
+
+static int subband_coeffs(DiracContext *s, int x, int y, int p,
+  SliceCoeffs c[MAX_DWT_LEVELS])
+{
+int level, coef = 0;
+for (level = 0; level < s->wavelet_depth; level++) {
+SliceCoeffs *o = &c[level];
+SubBand *b = &s->plane[p].band[level][3]; /* orientation doesn't matter */
+o->top   = b->height * y / s->num_y;
+o->left  = b->width  * x / s->num_x;
+o->tot_h = ((b->width  * (x + 1)) / s->num_x) - o->left;
+o->tot_v = ((b->height * (y + 1)) / s->num_y) - o->top;
+o->tot   = o->tot_h*o->tot_v;
+coef+= o->tot * (4 - !!level);
+}
+return coef;
+}
+
 /**
  * VC-2 Specification ->
  * 13.5.3 hq_slice(sx,sy)
  */
-static int decode_hq_slice(AVCodecContext *avctx, void *arg)
+static int decode_hq_slice(DiracContext *s, DiracSlice *slice, uint8_t *tmp_buf)
 {
-int i, quant, level, orientation, quant_idx;
-uint8_t quants[MAX_DWT_LEVELS][4];
-DiracContext *s = avctx->priv_data;
-DiracSlice *slice = arg;
+int i, level, orientation, quant_idx;
+int qfactor[MAX_DWT_LEVELS][4], qoffset[MAX_DWT_LEVELS][4];
 GetBitContext *gb = &slice->gb;
+SliceCoeffs coeffs_num[MAX_DWT_LEVELS];
 
 skip_bits_long(gb, 8*s->highquality.prefix_bytes);
 quant_idx = get_bits(gb, 8);
 
+if (quant_idx > DIRAC_MAX_QUANT_INDEX) {
+av_log(s->avctx, AV_LOG_ERROR, "Invalid quantization index - %i\n", quant_idx);
+return AVERROR_INVALIDDATA;
+}
+
 /* Slice quantization (slice_quantizers() in the specs) */
 for (level = 0; level < s->wavelet_depth; level++) {
 for (orientation = !!level; orientation < 4; orientation++) {
-quant = FFMAX(quant_idx - s->lowdelay.quant[level][orientation], 0);
-quants[level][orientation] = quant;
+const int quant = FFMAX(quant_idx - s->lowdelay.quant[level][orientation], 0);
+qfactor[level][orientation] = ff_dirac_qscale_tab[quant];
+qoffset[level][orientation] = ff_dirac_qoffset_intra_tab[quant] + 2;
 }
 }
 
 /* Luma + 2 Chroma planes */
 for (i = 0; i < 3; i++) {
-int64_t length = s->highquality.size_scaler * get_bits(gb, 8);
-int64_t bits_left = 8 * length;
-int64_t bits_end = get_bits_count(gb) + bits_left;
+int c, coef_num, coef_par, off = 0;
+int64_t length = s->highquality.size_scaler*get_bits(gb, 8);
+int64_t start = get_bits_count(gb);
+int64_t bits_end = start + 8*length;
 
 if (bits_end >= INT_MAX) {
 av_log(s->avctx, AV_LOG_ERROR, "end too far away\n");
 return AVERROR_INVALIDDATA;
 }
 
+coef_num = subband_coeffs(s, slice->slice_x, slice->slice_y, i, coeffs_num);
+
+if (s->pshift) {
+int32_t *dst = (int32_t *)tmp_buf;
+for (c = 0; c < coef_num; c++)
+dst[c] = dirac_get_se_golomb(gb);
+
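
The `subband_coeffs()` helper in this patch derives each slice's position and size from integer divisions of the band dimensions. A minimal one-dimensional sketch of that partitioning (`DemoSpan` and `demo_slice_span` are hypothetical names) shows why the slices tile the band exactly even when the size is not a multiple of the slice count: each slice ends where the next one starts.

```c
typedef struct DemoSpan {
    int start; /* first sample of the slice */
    int len;   /* number of samples in the slice */
} DemoSpan;

/* Splits `size` samples into `num` slices; slice `idx` covers
 * [size*idx/num, size*(idx+1)/num). */
static DemoSpan demo_slice_span(int size, int num, int idx)
{
    DemoSpan s;
    s.start = size * idx / num;
    s.len   = size * (idx + 1) / num - s.start;
    return s;
}
```

For a band of 10 samples split three ways, the spans come out as lengths 3, 3 and 4, so the remainder lands in the last slice rather than being dropped.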

[FFmpeg-devel] [PATCH 00/10] Dirac decoder improvements for the HQ profile

2016-06-23 Thread Rostislav Pehlivanov
This set of commits significantly improves the stability and performance
of the decoder, both with other profiles and the VC-2 HQ profile.

Suggestions on how to improve the performance of the VLC parser are highly
appreciated, since it is the biggest bottleneck so far. Another bottleneck
is the lack of SIMD for the iDWT at bit depths above 8.

All of the changes were ported over from this branch, which specialized the
Dirac decoder for the HQ profile:
https://github.com/atomnuker/ffmpeg_dirac_trimmed

The only feature missing from the current FFmpeg decoder is support for
field-coded interlaced content. That is the reason the trimmed decoder exists:
the feature was too messy and interfered with the reference frames code.
Suggestions or patches on how to bring it to the current decoder would be nice.
Other than that, the trimmed decoder is still more stable, as it is far simpler
to fuzz.

Rostislav Pehlivanov (10):
  diracdsp: add SIMD for the 10 bit version of put_signed_rect_clamped
  diracdsp: add dequantization SIMD
  diracdec: simplify golomb parsing and dequantization
  diracdec: decode HQ profile slices in rows
  diractab: expose the maximum quantization index as a macro
  diracdec: rewrite HQ slice decoding
  diracdec: implement a LUT-based Golomb code parser
  diracdec: do not allocate and free slice parameters every frame
  diracdec: run the final decoding stage/idwt for every plane in parallel
  diracdec: do not memset the entire coefficient buffer for HQ pictures

 libavcodec/Makefile|   3 +-
 libavcodec/dirac_vlc.c | 242 ++
 libavcodec/dirac_vlc.h |  51 ++
 libavcodec/diracdec.c  | 387 +
 libavcodec/diracdsp.c  |  24 +++
 libavcodec/diracdsp.h  |   4 +
 libavcodec/diractab.h  |   2 +
 libavcodec/vc2enc.c|   9 +-
 libavcodec/x86/diracdsp.asm|  88 ++
 libavcodec/x86/diracdsp_init.c |   8 +
 10 files changed, 659 insertions(+), 159 deletions(-)
 create mode 100644 libavcodec/dirac_vlc.c
 create mode 100644 libavcodec/dirac_vlc.h

-- 
2.8.1.369.geae769a



[FFmpeg-devel] [PATCH 01/10] diracdsp: add SIMD for the 10 bit version of put_signed_rect_clamped

2016-06-23 Thread Rostislav Pehlivanov
Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/x86/diracdsp.asm| 47 ++
 libavcodec/x86/diracdsp_init.c |  6 ++
 2 files changed, 53 insertions(+)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index a042413..9db7b67 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -22,6 +22,8 @@
 
 SECTION_RODATA
 pw_7: times 8 dw 7
+convert_to_unsigned_10bit: times 4 dd 0x200
+clip_10bit: times 8 dw 0x3ff
 
 cextern pw_3
 cextern pw_16
@@ -172,6 +174,48 @@ cglobal put_signed_rect_clamped_%1, 5,9,3, dst, dst_stride, src, src_stride, w,
 RET
 %endm
 
+%macro PUT_RECT_10 0
+; void put_signed_rect_clamped_10(uint8_t *dst, int dst_stride, const uint8_t *src, int src_stride, int width, int height)
+cglobal put_signed_rect_clamped_10, 6, 9, 6, dst, dst_stride, src, src_stride, w, h
+
+neg  wq
+neg  hq
+mov  r6, srcq
+mov  r7, dstq
+mov  r8, wq
+pxor m2, m2
+mova m3, [clip_10bit]
+mova m4, [convert_to_unsigned_10bit]
+
+.loop_h:
+mov  srcq, r6
+mov  dstq, r7
+mov  wq,   r8
+
+.loop_w:
+movu m0, [srcq+0*mmsize]
+movu m1, [srcq+1*mmsize]
+
+paddd    m0, m4
+paddd    m1, m4
+packusdw m0, m0, m1
+CLIPW    m0, m2, m3 ; packusdw saturates so it's fine
+
+movu [dstq], m0
+
+add  srcq, 2*mmsize
+add  dstq, 1*mmsize
+add  wq, 8
+jl   .loop_w
+
+add  r6, src_strideq
+add  r7, dst_strideq
+add  hq, 1
+jl   .loop_h
+
+RET
+%endm
+
 %macro ADD_RECT 1
 ; void add_rect_clamped(uint8_t *dst, uint16_t *src, int stride, int16_t *idwt, int idwt_stride, int width, int height)
 cglobal add_rect_clamped_%1, 7,9,3, dst, src, stride, idwt, idwt_stride, w, h
@@ -263,3 +307,6 @@ ADD_RECT sse2
 HPEL_FILTER sse2
 ADD_OBMC 32, sse2
 ADD_OBMC 16, sse2
+
+INIT_XMM sse4
+PUT_RECT_10
diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index 5fae798..4786eea 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -46,6 +46,8 @@ void ff_put_rect_clamped_sse2(uint8_t *dst, int dst_stride, const int16_t *src,
 void ff_put_signed_rect_clamped_mmx(uint8_t *dst, int dst_stride, const int16_t *src, int src_stride, int width, int height);
 void ff_put_signed_rect_clamped_sse2(uint8_t *dst, int dst_stride, const int16_t *src, int src_stride, int width, int height);
 
+void ff_put_signed_rect_clamped_10_sse4(uint8_t *dst, int dst_stride, const uint8_t *src, int src_stride, int width, int height);
+
 #if HAVE_YASM
 
 #define HPEL_FILTER(MMSIZE, EXT) \
@@ -184,4 +186,8 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 c->put_dirac_pixels_tab[2][0] = ff_put_dirac_pixels32_sse2;
 c->avg_dirac_pixels_tab[2][0] = ff_avg_dirac_pixels32_sse2;
 }
+
+if (EXTERNAL_SSE4(mm_flags)) {
+c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
+}
 }
-- 
2.8.1.369.geae769a
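
As a reading aid for the assembly in this patch: per sample, the loop adds the 0x200 bias (`convert_to_unsigned_10bit`) to a signed 32-bit input and clamps the result to the 10-bit range 0..0x3ff (`clip_10bit`). A scalar sketch of that per-sample operation follows. The `demo_` names are hypothetical, this is not FFmpeg's C fallback, and the sketch widens to 64 bits, so it does not reproduce the 32-bit wraparound of `paddd` on extreme inputs.

```c
#include <stdint.h>

/* One sample: add the 0x200 bias, then clamp to the 10-bit range. */
static uint16_t demo_signed_to_u10(int32_t v)
{
    int64_t t = (int64_t)v + 0x200;
    if (t < 0)
        t = 0;
    if (t > 0x3ff)
        t = 0x3ff;
    return (uint16_t)t;
}

/* One row of `width` samples, from signed 32-bit input to 16-bit output. */
static void demo_put_signed_row_10(uint16_t *dst, const int32_t *src, int width)
{
    int x;
    for (x = 0; x < width; x++)
        dst[x] = demo_signed_to_u10(src[x]);
}
```

The SIMD version does the same work on eight samples per iteration, reading two registers of 32-bit values and writing one register of 16-bit values.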



[FFmpeg-devel] [PATCH 02/10] diracdsp: add dequantization SIMD

2016-06-23 Thread Rostislav Pehlivanov
Currently unused, to be used in the following commits.

Signed-off-by: Rostislav Pehlivanov 
---
 libavcodec/diracdsp.c  | 24 
 libavcodec/diracdsp.h  |  4 
 libavcodec/x86/diracdsp.asm| 41 +
 libavcodec/x86/diracdsp_init.c |  4 +++-
 4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/libavcodec/diracdsp.c b/libavcodec/diracdsp.c
index ab8d149..d0cfd00 100644
--- a/libavcodec/diracdsp.c
+++ b/libavcodec/diracdsp.c
@@ -189,6 +189,27 @@ static void add_rect_clamped_c(uint8_t *dst, const uint16_t *src, int stride,
 }
 }
 
+#define DEQUANT_SUBBAND(PX) \
+static void dequant_subband_ ## PX ## _c(uint8_t *src, uint8_t *dst, ptrdiff_t stride, \
+ const int qf, const int qs, int64_t tot_v, int64_t tot_h) \
+{ \
+int i, y; \
+for (y = 0; y < tot_v; y++) { \
+PX c, sign, *src_r = (PX *)src, *dst_r = (PX *)dst; \
+for (i = 0; i < tot_h; i++) { \
+c = *src_r++; \
+sign = FFSIGN(c)*(!!c); \
+c = (FFABS(c)*qf + qs) >> 2; \
+*dst_r++ = c*sign; \
+} \
+src += tot_h << (sizeof(PX) >> 1); \
+dst += stride; \
+} \
+}
+
+DEQUANT_SUBBAND(int16_t)
+DEQUANT_SUBBAND(int32_t)
+
 #define PIXFUNC(PFX, WIDTH) \
 c->PFX ## _dirac_pixels_tab[WIDTH>>4][0] = ff_ ## PFX ## _dirac_pixels ## WIDTH ## _c; \
 c->PFX ## _dirac_pixels_tab[WIDTH>>4][1] = ff_ ## PFX ## _dirac_pixels ## WIDTH ## _l2_c; \
@@ -214,6 +235,9 @@ av_cold void ff_diracdsp_init(DiracDSPContext *c)
 c->biweight_dirac_pixels_tab[1] = biweight_dirac_pixels16_c;
 c->biweight_dirac_pixels_tab[2] = biweight_dirac_pixels32_c;
 
+c->dequant_subband[0] = c->dequant_subband[2] = dequant_subband_int16_t_c;
+c->dequant_subband[1] = c->dequant_subband[3] = dequant_subband_int32_t_c;
+
 PIXFUNC(put, 8);
 PIXFUNC(put, 16);
 PIXFUNC(put, 32);
diff --git a/libavcodec/diracdsp.h b/libavcodec/diracdsp.h
index 25a872d..c0ac56b 100644
--- a/libavcodec/diracdsp.h
+++ b/libavcodec/diracdsp.h
@@ -22,6 +22,7 @@
 #define AVCODEC_DIRACDSP_H
 
 #include <stdint.h>
+#include <stddef.h>
 
 typedef void (*dirac_weight_func)(uint8_t *block, int stride, int log2_denom, int weight, int h);
 typedef void (*dirac_biweight_func)(uint8_t *dst, const uint8_t *src, int stride, int log2_denom, int weightd, int weights, int h);
@@ -46,6 +47,9 @@ typedef struct {
 void (*add_rect_clamped)(uint8_t *dst/*align 16*/, const uint16_t *src/*align 16*/, int stride, const int16_t *idwt/*align 16*/, int idwt_stride, int width, int height/*mod 2*/);
 void (*add_dirac_obmc[3])(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 
+/* 0-1: int16_t and int32_t asm/c, 2-3: int16 and int32_t, C only */
+void (*dequant_subband[4])(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int64_t tot_v, int64_t tot_h);
+
 dirac_weight_func weight_dirac_pixels_tab[3];
 dirac_biweight_func biweight_dirac_pixels_tab[3];
 } DiracDSPContext;
diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index 9db7b67..f743363 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -289,6 +289,46 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
 RET
 %endm
 
+%macro DEQUANT_SUBBAND_32 0
+; void dequant_subband_32(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int64_t tot_v, int64_t tot_h)
+cglobal dequant_subband_32, 7, 9, 4, src, dst, stride, qf, qs, tot_v, tot_h
+
+movd   m2, qfd
+movd   m3, qsd
+SPLATD m2
+SPLATD m3
+neg    tot_vq
+neg    tot_hq
+mov    r7, dstq
+mov    r8, tot_hq
+
+.loop_v:
+mov    dstq,   r7
+mov    tot_hq, r8
+
+.loop_h:
+movu   m0, [srcq]
+
+pabsd  m1, m0
+pmulld m1, 

[FFmpeg-devel] [PATCH 2/2] avformat/udp: replace packet_gap with bitrate option

2016-06-23 Thread Ali KIZIL
On Mon, 13 Jun 2016, Michael Niedermayer wrote:

> On Sun, Jun 12, 2016 at 09:30:18PM +0200, Marton Balint wrote:
>> We haven't had a stable release since the packet_gap addition, so probably it
>> is worth reworking the option to something that makes more sense to the end
>> user. Also add burst_bits option to specify maximum length of bit bursts.
>>
>> Signed-off-by: Marton Balint
>> ---
>>  doc/protocols.texi    |  9 +++--
>>  libavformat/udp.c     | 51 +--
>>  libavformat/version.h |  2 +-
>>  3 files changed, 41 insertions(+), 21 deletions(-)
>
> I am in favor of both patches
>

Thanks, pushed both.

Regards,
Marton

It would be great if this patch were also applied to RTP output.

Kind Regards,



Re: [FFmpeg-devel] Mix image sequence to video & dshow to audio

2016-06-23 Thread Roger Pack
On 6/23/16, Gábor Alsecz  wrote:
> Dear All,
>
> I am on a Windows machine and have no idea how can i:
> - Grab input sound from attached mic (usb) without dshow device
> OR
> - mix up the following command with dshow to get audio input from attached
> mic.
>
> ffmpeg -loop 1 -i 2K_1.jpg -vcodec libx264 -preset medium -maxrate 3000k
> -bufsize 6000k -pix_fmt yuv420p -g 50 -c:a aac -b:a 128k -ac 2 -ar 44100
> output.mp4
>
>
> Any idea?

You just use two "-i" options, for example:
... -i 2K_1.jpg -f dshow -i audio="Microphone" ...


[FFmpeg-devel] [PATCH] doc: add Libav merge document

2016-06-23 Thread Clément Bœsch
From: Clément Bœsch 

---
Very incomplete, maybe splittable (split out the 3 first sections somewhere as
an announce on the website)

Comments from other people who have done merges in the past very welcome,
notably on the last 2 sections.
---
 doc/libav-merge.txt | 109 
 1 file changed, 109 insertions(+)
 create mode 100644 doc/libav-merge.txt

diff --git a/doc/libav-merge.txt b/doc/libav-merge.txt
new file mode 100644
index 000..2ae22b1
--- /dev/null
+++ b/doc/libav-merge.txt
@@ -0,0 +1,109 @@
+CONTEXT
+=======
+
+The FFmpeg project merges all the changes from the Libav project
+(https://libav.org) since the origin of the fork (around 2011).
+
+With the exception of some commits, due to technical/political disagreements or
+issues, the changes are merged on a more or less regular schedule (daily for
+years thanks to Michael, but more sparse nowadays).
+
+WHY
+===
+
+The majority of the active developers believe the project needs to keep this
+policy for various reasons.
+
+The most important one is that we don't want our users to have to choose
+between two distributors of libraries of the exact same name in order to have a
+different set of features and bugfixes. By taking the responsibility of
+unifying the two codebases, we allow users to benefit from the changes from the
+two teams.
+
+Today, FFmpeg has a much larger user base (we are distributed by every
+major distribution), so we consider this mission a priority.
+
+A different approach to the merge could have been to pick the changes we are
+interested in and drop most of the cosmetics and other less important changes.
+Unfortunately, this makes the following picks much harder, especially since the
+Libav project is involved in various deep API changes. As a result, we decided
+to take virtually everything done there.
+
+Any Libav developer is of course welcome anytime to contribute directly to the
+FFmpeg tree. Of course, we fully understand and are forced to accept that very
+few Libav developers are interested in doing so, but we still want to recognize
+their work. This leads us to create a merge commit for every single commit from
+Libav. The original commit appears totally unchanged with full authorship in
+our history (and the conflicts are resolved in the merge commit). That way, not
+a single thing from Libav will be lost in the future in case some reunification
+happens, or that project disappears one way or another.
+
+DOWNSIDES
+=========
+
+Of course, there are many downsides to this approach.
+
+- It causes a non-negligible amount of merge commit pollution. We make sure
+  there are not several levels of merges entangled (we do a 1:1 merge/commit),
+  but it's still a non-linear history.
+
+- Much duplicated work. For instance, we added libavresample in our tree to
+  keep compatibility with Libav when our libswresample was already covering the
+  exact same purpose. The same thing happened for various elements such as the
+  ProRes support (but with differences in features, bugs, licenses, ...). There
+  is much work to do to unify them, and any help is very much welcome.
+
+- So much manpower from both FFmpeg and Libav is lost because of this mess. We
+  know it, and we don't know how to fix it. It takes incredible time to do
+  these merges, so we have even less time to work on things we personally care
+  about. The bad vibes also do not help with keeping our developers motivated.
+
+- There is a growing technical risk factor with the merges due to the codebase
+  differing more and more.
+
+MERGE GUIDELINES
+================
+
+The following gives developer guidelines on how to proceed when merging Libav commits.
+
+Before starting, you can reduce the risk of errors on merge conflicts by using
+a different merge conflict style:
+
+$ git config --global merge.conflictstyle diff3
+
+Here is a script to help merging the next commit in the queue:
+
+#!/bin/sh
+
+if [ "$1" != "merge" -a "$1" != "noop" ]; then
+   printf "Usage: $0 <merge|noop [REF_HASH]>\n"
+   exit 0
+fi
+
+[ "$1" = "noop" ] && merge_opts="-s ours"
+
+nextrev=$(git rev-list libav/master --not master --no-merges | tail -n1)
+if [ -z "$nextrev" ]; then
+printf "Nothing to merge..\n"
+exit 0
+fi
+printf "Merging $(git log -n 1 --oneline $nextrev)\n"
+git merge --no-commit $merge_opts --no-ff --log $nextrev
+
+if [ "$1" = "noop" -a -n "$2" ]; then
+   printf "\nThis commit is a noop, see $2\n" >> .git/MERGE_MSG
+fi
+
+printf "\nMerged-by: $(git config --get user.name) <$(git config --get user.email)>\n" >> .git/MERGE_MSG
+
+
+The script assumes a remote named libav.
+
+It has two modes: merge, and noop. The noop mode creates a merge with no change
+to the HEAD. You can pass a hash as extra argument to reference a justification
+(it is common that we already have the change done in FFmpeg).
+
+TODO/FIXME/UNMERGED

Re: [FFmpeg-devel] [PATCH 8/8] avdevice/decklink: add support for setting input packet timestamp source

2016-06-23 Thread Matthias Hunstock
Am 23.06.2016 um 02:47 schrieb Marton Balint:

> diff --git a/doc/indevs.texi b/doc/indevs.texi
> [...]
> +@item video_pts
> +Sets the video packet timestamp source. Must @samp{video}, @samp{audio},
> +@samp{reference} or @samp{wallclock}. Defaults to @samp{video}.
> +
> +@item audio_pts
> +Sets the audio packet timestamp source. Must @samp{video}, @samp{audio},
> +@samp{reference} or @samp{wallclock}. Defaults to @samp{audio}.
> +
>  @end table

Two small typos: "... Must _be_ @samp{video}, ..."


Matthias




Re: [FFmpeg-devel] [PATCH] h264: make H264ParamSets sps const

2016-06-23 Thread Benoit Fouet

Hi,


On 21/06/2016 16:42, Benoit Fouet wrote:

Hi,

On 21/06/2016 16:29, Hendrik Leppkes wrote:
On Tue, Jun 21, 2016 at 4:20 PM, Benoit Fouet wrote:

Hi,


On 21/06/2016 14:52, Hendrik Leppkes wrote:

On Tue, Jun 21, 2016 at 2:40 PM, Clément Bœsch  wrote:

On Tue, Jun 21, 2016 at 02:34:33PM +0200, Benoit Fouet wrote:

Hi,

Unless I totally missed something, the FIXME in the H264ParamSets structure
should be fixed by the attached patch.

--
Ben

  From 28ae10498f81070539bdb8f40236326743350101 Mon Sep 17 00:00:00 2001

From: Benoit Fouet 
Date: Tue, 21 Jun 2016 14:17:13 +0200
Subject: [PATCH] h264: make H264ParamSets sps const

---
   libavcodec/h264.h   | 3 +--
   libavcodec/h264_slice.c | 2 +-
   2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/libavcodec/h264.h b/libavcodec/h264.h
index c4d2921..b809ee5 100644
--- a/libavcodec/h264.h
+++ b/libavcodec/h264.h
@@ -234,8 +234,7 @@ typedef struct H264ParamSets {
   AVBufferRef *sps_ref;
   /* currently active parameters sets */
   const PPS *pps;
-// FIXME this should properly be const
-SPS *sps;
+const SPS *sps;
   } H264ParamSets;

   /**
diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c
index 6e7b940..da7f9dd 100644
--- a/libavcodec/h264_slice.c
+++ b/libavcodec/h264_slice.c
@@ -873,7 +873,7 @@ static enum AVPixelFormat get_pixel_format(H264Context *h, int force_callback)
   /* export coded and cropped frame dimensions to AVCodecContext */
   static int init_dimensions(H264Context *h)
   {
-SPS *sps = h->ps.sps;
+SPS *sps = (SPS*)h->ps.sps_ref->data;
   int width  = h->width  - (sps->crop_right + sps->crop_left);
   int height = h->height - (sps->crop_top   + sps->crop_bottom);
   av_assert0(sps->crop_right + sps->crop_left < (unsigned)h->width);

So it's not actually const, right?


Indeed, the FIXME wasn't just there because someone forgot to write
"const" in front of it, but because it was used in some parts as
not-const.
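
A minimal sketch of the point being made, using hypothetical stand-in types (DemoSPS and DemoParamSets are illustrative names, not FFmpeg's actual structures): once the member is a pointer-to-const, reads keep working, but any code that still writes through it no longer compiles and has to reach the underlying storage some other way.

```c
/* Hypothetical stand-ins for SPS / H264ParamSets, for illustration only. */
typedef struct DemoSPS {
    unsigned crop_left, crop_right;
} DemoSPS;

typedef struct DemoParamSets {
    const DemoSPS *sps;   /* read-only view of the active parameter set */
} DemoParamSets;

/* Reading through the const pointer is fine. */
static unsigned demo_cropped_width(const DemoParamSets *ps, unsigned width)
{
    return width - (ps->sps->crop_left + ps->sps->crop_right);
}

/* Writing needs a non-const pointer to the underlying storage.
 * "ps->sps->crop_left = 0;" would be rejected by the compiler,
 * which is why a bare "const" cannot simply be added to the member
 * while some function still modifies the parameter set. */
static void demo_reset_crop(DemoSPS *storage)
{
    storage->crop_left  = 0;
    storage->crop_right = 0;
}
```

Casting the const away, as the patch above does via sps_ref->data, silences the compiler but leaves the write in place, so the pointer is not const in practice.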


OK, right... Thanks for reminding me to read the code more carefully before
sending a patch.

As far as I can see, the only place where this constness is not preserved is
in the init_dimensions function (in h264_slice), in a dead part of the code,
as crop is asserted at the beginning of the very same function.
Please correct me if I've missed other places.


If anything, the asserts should probably be removed, because bad files
should never be able to trigger assertions, and the existing check should
remain.


Well, the SPS "decoder" already takes care of the check (see 
ff_h264_decode_seq_parameter_set).
So I could remove the check, because it seems useless, instead of 
removing it because "bad things happen", what do you think?




Any objection to this patch now?

Thanks,
--
Ben


Re: [FFmpeg-devel] lavc/mediacodec: improve end of stream handling

2016-06-23 Thread Matthieu Bouron
On Wed, Jun 22, 2016 at 01:10:54PM +0200, Matthieu Bouron wrote:
> On Tue, Jun 21, 2016 at 02:41:19PM +0200, Matthieu Bouron wrote:
> > Hello,
> > 
> > The following patchset improves handling of EOS (End Of Stream) in the
> > mediacodec decoder.
> > 
> > The decoder now relies on the relevant buffer info flags to detect EOS
> > instead of counting queued/dequeued buffers.
> > 
> > It also fixes a potential loss of frames while flushing (draining) the
> > remaining frames from the decoder.
> 
> If there is no objection, I will push the patchset in one day.

Patchset pushed.
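
For context, a rough sketch of the two detection strategies mentioned above. This is not the pushed patch: the DemoBufferInfo struct and the demo_* functions are hypothetical, and the flag value 4 is assumed to mirror Android's MediaCodec BUFFER_FLAG_END_OF_STREAM constant.

```c
#include <stdint.h>

/* Assumed to match MediaCodec.BUFFER_FLAG_END_OF_STREAM on Android. */
#define DEMO_BUFFER_FLAG_END_OF_STREAM 4

/* Hypothetical mirror of the per-buffer info a decoder hands back. */
typedef struct DemoBufferInfo {
    int32_t  offset;
    int32_t  size;
    int64_t  pts_us;
    uint32_t flags;
} DemoBufferInfo;

/* Counting heuristic: assume EOS once every queued buffer came back.
 * This can misfire when the decoder's output count does not match
 * its input count. */
static int demo_eos_by_counting(int queued, int dequeued, int draining)
{
    return draining && queued == dequeued;
}

/* Flag-based detection: trust the EOS bit reported with the buffer. */
static int demo_eos_by_flag(const DemoBufferInfo *info)
{
    return (info->flags & DEMO_BUFFER_FLAG_END_OF_STREAM) != 0;
}
```

With the flag-based check, draining can continue until a dequeued buffer actually carries the EOS bit, independent of how many buffers were queued.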

[...]


Re: [FFmpeg-devel] avformat/mov: improve qt metadata reading and writing

2016-06-23 Thread Kevin Wheatley
On Thu, Jun 23, 2016 at 10:44 AM, Michael Niedermayer wrote:
> that was maybe forgotten due to the rfc in the subject

I never pushed for it as our internal requirement for the feature went
away and I only try to push things we actually found useful :-) the
patch probably won't merge cleanly now anyway.

The other part of the problem is ideally I'd like to output the
metadata correctly, but to do that ffmpeg would need to understand
multiple data types better (or at least when I looked at it last this
was the case). I also remember there may be similar comments about
libav needing more robust metadata support too.

Kevin


Re: [FFmpeg-devel] avformat/mov: improve qt metadata reading and writing

2016-06-23 Thread Michael Niedermayer
On Wed, Jun 22, 2016 at 07:12:48PM +0200, David Murmann wrote:
> Hi all,
> 
> this has been brought up before, the MOV muxer/demuxer currently does not
> support all possible datatypes that are allowed in keys/mdta style metadata.
> This is specified here:
> https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/Metadata/Metadata.html#//apple_ref/doc/uid/TP4939-CH1-SW35
> 
> At the moment only data types 1 (UTF-8) and 23 (Float32) work. I have a patch
> that adds support for 21 and 22, as I have found those in actual files
> produced by ARRI cameras.
> 
> I have a second patch that adds a flag to the muxer to output keys/mdta style
> metadata, to be able to keep the metadata while transcoding camera files.
> 

> This patch tries to do something similar, but mine is shorter (though not as
> complete) so hopefully easy to review:
> https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2015-September/178347.html
> (I actually found this only after writing my patches.)

that was maybe forgotten due to the rfc in the subject

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Opposition brings concord. Out of discord comes the fairest harmony.
-- Heraclitus




Re: [FFmpeg-devel] [PATCH 1/2] avformat/mov: add more datatypes in metadata handling

2016-06-23 Thread Michael Niedermayer
On Wed, Jun 22, 2016 at 07:13:00PM +0200, David Murmann wrote:
>  mov.c |   36 ++--
>  1 file changed, 34 insertions(+), 2 deletions(-)
> fc9b00e1fda6061cb7d281c8a513c09426a3cc20  
> 0001-avformat-mov-add-more-datatypes-in-metadata-handling.patch
> From 2c7a39037e93154f332e404238bd537b8a6c05ae Mon Sep 17 00:00:00 2001
> From: David Murmann 
> Date: Wed, 22 Jun 2016 15:20:33 +0200
> Subject: [PATCH 1/2] avformat/mov: add more datatypes in metadata handling
> 
> Implement variable sized big-endian integers, since these are found
> in files created by ARRI cameras.
> ---
>  libavformat/mov.c | 36 ++--
>  1 file changed, 34 insertions(+), 2 deletions(-)

applied

thanks
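
For readers unfamiliar with the format, a variable-sized big-endian integer can be read roughly as sketched below. This is an illustrative sketch only, not the code from the applied patch; the demo_* function names are hypothetical, and the signed variant assumes a two's-complement target.

```c
#include <stddef.h>
#include <stdint.h>

/* Read an unsigned big-endian integer of len bytes (1..8) from buf.
 * Returns 0 on success, -1 if len is out of range. */
static int demo_read_be_uint(const uint8_t *buf, size_t len, uint64_t *out)
{
    uint64_t v = 0;

    if (len < 1 || len > 8)
        return -1;
    for (size_t i = 0; i < len; i++)
        v = (v << 8) | buf[i];   /* most significant byte comes first */
    *out = v;
    return 0;
}

/* Signed variant: same read, then sign-extend from the top bit of the
 * first byte. The final cast assumes two's-complement representation. */
static int demo_read_be_int(const uint8_t *buf, size_t len, int64_t *out)
{
    uint64_t v;

    if (demo_read_be_uint(buf, len, &v) < 0)
        return -1;
    if (len < 8 && (buf[0] & 0x80))
        v |= ~UINT64_C(0) << (8 * len);
    *out = (int64_t)v;
    return 0;
}
```

Rejecting unsupported lengths with an error code, rather than asserting, keeps malformed files from aborting the reader.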

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire




Re: [FFmpeg-devel] [PATCH] avcodec: remove libutvideo wrapper support

2016-06-23 Thread Paul B Mahol
On 6/23/16, Carl Eugen Hoyos  wrote:
> Paul B Mahol writes:
>
>>  * The 10-bit decoding support is available now in native decoder.
>
> Please only keep this line.
>
> I cannot reproduce an encoder crash, the other points are not
> ok afaict.

They are all ok points.

>
> Thank you for working on this, Carl Eugen

Applied.



[FFmpeg-devel] Mix image sequence to video & dshow to audio

2016-06-23 Thread Gábor Alsecz
Dear All,

I am on a Windows machine and have no idea how I can:
- Grab input sound from an attached (USB) mic without a dshow device
OR
- combine the following command with dshow to get audio input from the
attached mic.

ffmpeg -loop 1 -i 2K_1.jpg -vcodec libx264 -preset medium -maxrate 3000k
-bufsize 6000k -pix_fmt yuv420p -g 50 -c:a aac -b:a 128k -ac 2 -ar 44100
output.mp4


Any idea?

br,
Gabor


Re: [FFmpeg-devel] [PATCH] avcodec: remove libutvideo wrapper support

2016-06-23 Thread Carl Eugen Hoyos
Paul B Mahol writes:

>  * The 10-bit decoding support is available now in native decoder.

Please only keep this line.

I cannot reproduce an encoder crash, the other points are not 
ok afaict.

Thank you for working on this, Carl Eugen



Re: [FFmpeg-devel] [FFmpeg-cvslog] MAINTAINERS: Remove Linux / PowerPC maintainer

2016-06-23 Thread Clément Bœsch
On Thu, Jun 23, 2016 at 01:44:42AM +0200, Michael Niedermayer wrote:
> On Thu, Jun 23, 2016 at 01:01:41AM +0200, Clément Bœsch wrote:
> > On Thu, Jun 23, 2016 at 12:57:44AM +0200, Michael Niedermayer wrote:
> > > ffmpeg | branch: master | Michael Niedermayer | Thu Jun 23 00:49:29 2016 +0200 | [18f687f73709a3ad5bb6b6fbbdbbce6dc8a91036] | committer: Michael Niedermayer
> > > 
> > > MAINTAINERS: Remove Linux / PowerPC maintainer
> > > 
> > > See: [FFmpeg-devel] PPC64: PowerPC Maintainer information is incorrect
> > > 
> > > Signed-off-by: Michael Niedermayer 
> > > 
> > > > http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=18f687f73709a3ad5bb6b6fbbdbbce6dc8a91036
> > > ---
> > > 
> > >  MAINTAINERS |1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 24145da..b9fa0c5 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -552,7 +552,6 @@ AVR32                   Mans Rullgard
> > >  MIPS                    Mans Rullgard, Nedeljko Babic
> > >  Mac OS X / PowerPC      Romain Dolbeau, Guillaume Poirier
> > >  Amiga / PowerPC         Colin Ward
> > > -Linux / PowerPC         Luca Barbato
> > >  Windows MinGW           Alex Beregszaszi, Ramiro Polla
> > >  Windows Cygwin          Victor Paesa
> > >  Windows MSVC            Matthew Oliver, Hendrik Leppkes
> > 
> > Don't you think you should remove Diego, Måns, Kostya, ... as well?
> 
> They didn't ask me to remove them, they didn't remove themselves even
> though they could, they didn't post a patch to remove themselves.
> No contributor said that he contacted them and they no longer maintain
> the code they are listed for. (or I missed that)

Well… I think it's common knowledge that they left the FFmpeg Project, and
with the exception of Diego, they also left the Libav project (public
declaration can be found).

-- 
Clément B.

