Re: [FFmpeg-devel] vf_drawtext: add force_boxw_equl_textw option

2017-02-13 Thread Steven Liu
2017-02-14 15:02 GMT+08:00 su.gao :

> Add this option to force the box width to equal the text width:
>
> diff --git a/libavfilter/vf_drawtext.c b/libavfilter/vf_drawtext.c
> index 0b94725..5b16cfa 100644
> --- a/libavfilter/vf_drawtext.c
> +++ b/libavfilter/vf_drawtext.c
> @@ -164,6 +164,7 @@ typedef struct DrawTextContext {
>      int use_kerning;                ///< font kerning is used - true/false
>      int tabsize;                    ///< tab size
>      int fix_bounds;                 ///< do we let it go out of frame bounds - t/f
> +    int force_boxw_equl_textw;      ///< tab size
>
>      FFDrawContext dc;
>      FFDrawColor fontcolor;          ///< foreground color
> @@ -209,6 +210,7 @@ static const AVOption drawtext_options[]= {
>      {"bordercolor", "set border color", OFFSET(bordercolor.rgba), AV_OPT_TYPE_COLOR, {.str="black"}, CHAR_MIN, CHAR_MAX, FLAGS},
>      {"shadowcolor", "set shadow color", OFFSET(shadowcolor.rgba), AV_OPT_TYPE_COLOR, {.str="black"}, CHAR_MIN, CHAR_MAX, FLAGS},
>      {"box", "set box", OFFSET(draw_box), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS},
> +    {"force_boxw_equl_textw", "force the  box width equl text width", OFFSET(force_boxw_equl_textw), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS},
>      {"boxborderw", "set box border width", OFFSET(boxborderw), AV_OPT_TYPE_INT, {.i64=0}, INT_MIN, INT_MAX, FLAGS},
>      {"line_spacing", "set line spacing in pixels", OFFSET(line_spacing), AV_OPT_TYPE_INT, {.i64=0}, INT_MIN, INT_MAX, FLAGS},
>      {"fontsize", "set font size", OFFSET(fontsize), AV_OPT_TYPE_INT, {.i64=0}, 0, INT_MAX, FLAGS},
> @@ -1298,7 +1300,11 @@ static int draw_text(AVFilterContext *ctx, AVFrame *frame,
>      update_color_with_alpha(s, &bordercolor, s->bordercolor);
>      update_color_with_alpha(s, &boxcolor   , s->boxcolor   );
>
> -    box_w = FFMIN(width - 1 , max_text_line_w);
> +    if(0==s->force_boxw_equl_textw){
> +        box_w = FFMIN(width - 1 , max_text_line_w);
> +    }else{
> +        box_w =   (int)s->var_values[VAR_TEXT_W];
> +    }
>
Please don't create the patch with git diff.
Commit your change with git commit, generate the patch with git
format-patch, and send it to the mailing list with git send-email.

Please refer to: https://ffmpeg.org/developer.html#toc-Submitting-patches-1
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH v2 0/5] A native Opus encoder

2017-02-13 Thread Rostislav Pehlivanov
On 13 February 2017 at 19:42, Rostislav Pehlivanov 
wrote:

>
>
> On 11 February 2017 at 00:25, Rostislav Pehlivanov 
> wrote:
>
>> This series of commits implements a native Opus encoder.
>>
>> This is v2 of the patchset, the changes are:
>> - The forward MDCT doesn't need a third buffer and can handle
>>   the full set of sizes the init function allows
>> - The encoder has a new faster than libopus and more accurate
>>   PVQ search algorithm and can now handle transients properly.
>>
>> Rostislav Pehlivanov (5):
>>   opus_rc: add entropy encoding functions
>>   imdct15: rename to mdct15 and add a forward transform
>>   opus_celt: move quantization and band decoding to opus_pvq.c
>>   opus_celt: rename structures to better names and reorganize them
>>   opus: add a native Opus encoder
>>
>>  configure                    |    7 +-
>>  libavcodec/Makefile          |    5 +-
>>  libavcodec/aac.h             |    4 +-
>>  libavcodec/aacdec.c          |    2 +-
>>  libavcodec/aacdec_template.c |    4 +-
>>  libavcodec/allcodecs.c       |    2 +-
>>  libavcodec/mdct15.c          |  335 +
>>  libavcodec/mdct15.h          |   70 ++
>>  libavcodec/opus.h            |   32 +-
>>  libavcodec/opus_celt.c       | 1540 ++
>>  libavcodec/opus_celt.h       |  164 +
>>  libavcodec/opus_pvq.c        | 1157 +++
>>  libavcodec/opus_pvq.h        |   41 ++
>>  libavcodec/opus_rc.c         |  182 -
>>  libavcodec/opus_rc.h         |   35 +-
>>  libavcodec/opusdec.c         |    7 +-
>>  libavcodec/opusenc.c         | 1128 +++
>>  libavcodec/opustab.c         |    4 +
>>  libavcodec/opustab.h         |    4 +
>>  19 files changed, 3506 insertions(+), 1217 deletions(-)
>>  create mode 100644 libavcodec/mdct15.c
>>  create mode 100644 libavcodec/mdct15.h
>>  create mode 100644 libavcodec/opus_celt.h
>>  create mode 100644 libavcodec/opus_pvq.c
>>  create mode 100644 libavcodec/opus_pvq.h
>>  create mode 100644 libavcodec/opusenc.c
>>
>> --
>> 2.11.0.483.g087da7b7c
>>
>>
> I plan to push the patchset with all the fixes people noticed tomorrow
> unless someone finds something else that's wrong.
>

Applied!
Thanks to everyone who commented on it.


Re: [FFmpeg-devel] [PATCH] avcodec/h264, videotoolbox: fix use-after-free of AVFrame buffer when VT decoder fails

2017-02-13 Thread wm4
On Mon, 13 Feb 2017 18:04:10 -0800
Aman Gupta  wrote:

> From: Aman Gupta 
> 
> The videotoolbox hwaccel returns a dummy frame with a 1-byte buffer from
> alloc_frame(). In end_frame(), this buffer is replaced with the actual
> decoded data from VTDecompressionSessionDecodeFrame(). This is hacky,
> but works well enough, as long as the VT decoder returns a valid frame on
> every end_frame() call.
> 
> In the case of errors, things get more interesting. Before
> 9747219958060d8c4f697df62e7f172c2a77e6c7, the dummy 1-byte frame would
> accidentally be returned all the way up to the user. After that commit,
> the dummy frame was properly unref'd so the user received an error.
> 
> However, since that commit, VT hwaccel failures started causing random
> segfaults in the h264 decoder. This happened more often on iOS where the
> VT implementation is more likely to throw errors on bitstream anomalies.
> A recent report of this issue can be seen in
> http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html
> 
> The root cause here is that between the calls to alloc_frame() and
> end_frame(), the h264 decoder will reference the dummy 1-byte frame in
> its ref_list. When the end_frame() call fails, the dummy frame is
> unref'd but still referenced in the h264 decoder. This eventually causes
> a NULL dereference and segmentation fault.
> 
> This patch fixes the issue by properly discarding references to the
> dummy frame in the H264Context after the frame has been unref'd.
> ---
>  libavcodec/h264_picture.c |  3 +++
>  libavcodec/h264_refs.c| 20 ++--
>  libavcodec/h264dec.h  |  1 +
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
> index f634d2a..702ca12 100644
> --- a/libavcodec/h264_picture.c
> +++ b/libavcodec/h264_picture.c
> @@ -177,6 +177,9 @@ int ff_h264_field_end(H264Context *h, H264SliceContext 
> *sl, int in_setup)
>  if (err < 0)
>  av_log(avctx, AV_LOG_ERROR,
> "hardware accelerator failed to decode picture\n");
> +
> +if (err < 0 && avctx->hwaccel->pix_fmt == AV_PIX_FMT_VIDEOTOOLBOX)
> +ff_h264_remove_cur_pic_ref(h);
>  }
>  
>  #if FF_API_CAP_VDPAU
> diff --git a/libavcodec/h264_refs.c b/libavcodec/h264_refs.c
> index 97bf588..9c77bc7 100644
> --- a/libavcodec/h264_refs.c
> +++ b/libavcodec/h264_refs.c
> @@ -560,6 +560,23 @@ static H264Picture *remove_long(H264Context *h, int i, 
> int ref_mask)
>  return pic;
>  }
>  
> +void ff_h264_remove_cur_pic_ref(H264Context *h)
> +{
> +int j;
> +
> +if (h->short_ref[0] == h->cur_pic_ptr) {
> +remove_short_at_index(h, 0);
> +}
> +
> +if (h->cur_pic_ptr->long_ref) {
> +for (j = 0; j < FF_ARRAY_ELEMS(h->long_ref); j++) {
> +if (h->long_ref[j] == h->cur_pic_ptr) {
> +remove_long(h, j, 0);
> +}
> +}
> +}
> +}
> +
>  void ff_h264_remove_all_refs(H264Context *h)
>  {
>  int i;
> @@ -571,8 +588,7 @@ void ff_h264_remove_all_refs(H264Context *h)
>  
>  if (h->short_ref_count && !h->last_pic_for_ec.f->data[0]) {
>  ff_h264_unref_picture(h, &h->last_pic_for_ec);
> -if (h->short_ref[0]->f->buf[0])
> -ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
> +ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
>  }
>  
>  for (i = 0; i < h->short_ref_count; i++) {
> diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
> index 5f868b7..063afed 100644
> --- a/libavcodec/h264dec.h
> +++ b/libavcodec/h264dec.h
> @@ -569,6 +569,7 @@ int ff_h264_alloc_tables(H264Context *h);
>  int ff_h264_decode_ref_pic_list_reordering(H264SliceContext *sl, void 
> *logctx);
>  int ff_h264_build_ref_list(H264Context *h, H264SliceContext *sl);
>  void ff_h264_remove_all_refs(H264Context *h);
> +void ff_h264_remove_cur_pic_ref(H264Context *h);
>  
>  /**
>   * Execute the reference picture marking (memory management control 
> operations).

Still a bit hacky, but if it improves the crash situation, why not.

I'm wondering if it might be better to just check for a dummy buffer
before returning a frame to the user, though. (Instead of strictly
removing the dummy buffer on the end_frame hwaccel callback.)


Re: [FFmpeg-devel] deduplicated [PATCH] Cinepak: speed up decoding several-fold, depending on the scenario, by supporting multiple output pixel formats.

2017-02-13 Thread wm4
On Mon, 13 Feb 2017 18:51:39 +0100
u-9...@aetey.se wrote:

> Then abstracting a "mini-swscale" could become justifiable.

And this is why we should probably reject this patch. What you wrote
paints a horrifying future.

Note that we would have this discussion even if it'd speed up the h264
decoder. Pissing all over modularization is not a good thing to do.

If you really want to get anything applied, you should probably try
looking at outputting ycgco, which appears to be the native colorspace
of the codec (and convert it with vf_colorspace, I guess).


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg_filter: initialize cuvid for filter_complex

2017-02-13 Thread Hendrik Leppkes
On Mon, Feb 13, 2017 at 11:00 PM, Timo Rothenpieler
 wrote:
>> It is a problem in NVENC.
>>
>> You create the first frame in CUVID before NVENC is initialized, so this
>> first frame is not accessible to NVENC until
>> dl_fn->cuda_dl->cuCtxPushCurrent(ctx->cu_context) is called in NVENC.
>>
>> This trivial patch should fix your problem.
>>
>> M.
>
> Very interesting. I don't think this patch is the proper fix though.
> There never should be an active cuda context when returning from a
> function, at least that's the premise under which I wrote all cuda
> related functions so far.
>
> This must mean that before, cuvid or something else must somehow have
> leaked a bound cuda context to nvenc. So that might need fixing as well.
>

Indeed, having an implicit context active would be rather fragile, so
the best approach would be to revisit both cuvid and nvenc and make sure
contexts are explicitly pushed and popped wherever needed - but I assume
that's what you have planned to do now already. ;)

This reminds me of this patch from Libav which landed a couple of weeks ago:
https://git.libav.org/?p=libav.git;a=commitdiff;h=fb59f87ce72035b940c3f5045884098b9324e1b2

It's hardly complete and only handles it in one place, but it's
probably fixing a similar issue.

- Hendrik
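
The convention discussed above (no context left bound when a function returns, with every push matched by a pop) can be illustrated with a toy model. This is only a sketch in plain C and does not call the CUDA driver API: `ctx_push`, `ctx_pop`, `ctx_depth`, `encode_balanced` and `encode_leaky` are hypothetical names standing in for `cuCtxPushCurrent`, `cuCtxPopCurrent` and for code that needs a bound context.

```c
#include <assert.h>
#include <stddef.h>

/* Toy context stack; a stand-in for the driver's per-thread stack. */
#define MAX_DEPTH 8
static int ctx_stack[MAX_DEPTH];
static size_t ctx_top; /* number of contexts currently bound */

void ctx_push(int ctx)
{
    assert(ctx_top < MAX_DEPTH);
    ctx_stack[ctx_top++] = ctx;
}

int ctx_pop(void)
{
    assert(ctx_top > 0);
    return ctx_stack[--ctx_top];
}

size_t ctx_depth(void)
{
    return ctx_top;
}

/* Balanced: the context is bound only for the duration of the call. */
void encode_balanced(int ctx)
{
    ctx_push(ctx);
    /* ... work that needs the context ... */
    ctx_pop();
}

/* Leaky: returns with the context still bound, so later code may
 * silently come to depend on it. */
void encode_leaky(int ctx)
{
    ctx_push(ctx);
    /* ... work, but no matching pop ... */
}
```

After `encode_balanced(1)` the stack depth is back to zero, while `encode_leaky(2)` leaves one context bound for whoever runs next, which is the kind of implicit dependency the thread describes as fragile.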


[FFmpeg-devel] AVFMT_FLAG_NOBUFFER not working on ffmpeg2.8 branch

2017-02-13 Thread Shi Qiu
libavformat/utils.c, line 3308:

> if (ic->flags & AVFMT_FLAG_NOBUFFER)
>     free_packet_buffer(&ic->internal->packet_buffer,
>                        &ic->internal->packet_buffer_end);
> {
>     pkt = add_to_pktbuf(&ic->internal->packet_buffer, &pkt1,
>                         &ic->internal->packet_buffer_end);
>     if (!pkt) {
>         ret = AVERROR(ENOMEM);
>         goto find_stream_info_err;
>     }
>     if ((ret = av_dup_packet(pkt)) < 0)
>         goto find_stream_info_err;
> }

should be:

> if (ic->flags & AVFMT_FLAG_NOBUFFER)
> {
>     free_packet_buffer(&ic->internal->packet_buffer,
>                        &ic->internal->packet_buffer_end);
> }
> else
> {
>     pkt = add_to_pktbuf(&ic->internal->packet_buffer, &pkt1,
>                         &ic->internal->packet_buffer_end);
>     if (!pkt) {
>         ret = AVERROR(ENOMEM);
>         goto find_stream_info_err;
>     }
>     if ((ret = av_dup_packet(pkt)) < 0)
>         goto find_stream_info_err;
> }
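
The difference between the two snippets comes down to the missing `else`: without it, the braces form an unconditional block. A minimal sketch of that control flow follows, with the packet queue reduced to a counter. `FLAG_NOBUFFER`, `buffered_after_buggy` and `buffered_after_fixed` are hypothetical names for illustration and are not part of libavformat.

```c
#include <assert.h>

#define FLAG_NOBUFFER 0x1 /* illustrative flag bit, value chosen arbitrarily */

/* Without the else, the queue is flushed and then the new packet is
 * buffered anyway, because the braced block always runs. */
int buffered_after_buggy(int flags, int queued)
{
    assert(queued >= 0);
    if (flags & FLAG_NOBUFFER)
        queued = 0;          /* models free_packet_buffer() */
    {
        queued++;            /* models add_to_pktbuf(); runs every time */
    }
    return queued;
}

/* With the else, a packet is buffered only when the flag is not set. */
int buffered_after_fixed(int flags, int queued)
{
    assert(queued >= 0);
    if (flags & FLAG_NOBUFFER) {
        queued = 0;
    } else {
        queued++;
    }
    return queued;
}
```

With the flag set and three packets queued, the first variant ends with one packet in the queue and the second with none; with the flag clear, both end with four.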


Re: [FFmpeg-devel] [PATCH] hlsenc: intialize only on ref_pkt (v2)

2017-02-13 Thread Steven Liu
2017-02-13 21:15 GMT+08:00 Miroslav Slugeň :

> This patch fixes cutting HLS segments to exactly the same length. It
> initializes only on the first ref_packet, which is a video frame, not an
> audio frame (the old behavior).
>
> It should now be possible to create segments of exactly the same length if
> we use the new -force_key_frames hls:time_in_seconds parameter.
>
> This is required to support adaptive HLS.
>
> This patch was split into two parts; this is the first, independent part.
>
> --
> Miroslav Slugeň
>
The patch compiles, but I cannot reproduce the problem without your
patch. Can you send steps to reproduce it and a test file?


[FFmpeg-devel] drawtext option

2017-02-13 Thread su.gao
I found that the box can be narrower than the text, and there is no
option to make the two widths equal.

So I suggest that FFmpeg add an option (such as boxw_equal_textw) to make
the box width match the text width.

the old source:
box_w = FFMIN(width - 1 , max_text_line_w);
box_h = FFMIN(height - 1, y + s->max_glyph_h);

the new source:
if(0==s->boxw_equal_textw){
  box_w = FFMIN(width - 1 , max_text_line_w);
}else{
  box_w =   (int)s->var_values[VAR_TEXT_W];   
}
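
The effect of the proposed branch can be tried in isolation. The following is a minimal sketch, not code from vf_drawtext.c: `box_width` and `min_int` are hypothetical helpers, and the arguments stand in for `width`, `max_text_line_w`, the `VAR_TEXT_W` value and the proposed option.

```c
#include <assert.h>

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* frame_w:     frame width in pixels
 * max_line_w:  width of the widest rendered text line
 * text_w:      value of the text width variable
 * force_equal: the proposed boolean option */
int box_width(int frame_w, int max_line_w, int text_w, int force_equal)
{
    assert(frame_w > 0);
    if (!force_equal)
        return min_int(frame_w - 1, max_line_w); /* old behavior */
    return text_w;                               /* proposed behavior */
}
```

For a 640-pixel frame and 700-pixel text, the old path clamps the box to 639 pixels, which is one way the box can end up narrower than the text; the proposed path returns 700.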






[FFmpeg-devel] [PATCH] avcodec/h264, videotoolbox: fix use-after-free of AVFrame buffer when VT decoder fails

2017-02-13 Thread Aman Gupta
From: Aman Gupta 

The videotoolbox hwaccel returns a dummy frame with a 1-byte buffer from
alloc_frame(). In end_frame(), this buffer is replaced with the actual
decoded data from VTDecompressionSessionDecodeFrame(). This is hacky,
but works well enough, as long as the VT decoder returns a valid frame on
every end_frame() call.

In the case of errors, things get more interesting. Before
9747219958060d8c4f697df62e7f172c2a77e6c7, the dummy 1-byte frame would
accidentally be returned all the way up to the user. After that commit,
the dummy frame was properly unref'd so the user received an error.

However, since that commit, VT hwaccel failures started causing random
segfaults in the h264 decoder. This happened more often on iOS where the
VT implementation is more likely to throw errors on bitstream anomalies.
A recent report of this issue can be seen in
http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html

The root cause here is that between the calls to alloc_frame() and
end_frame(), the h264 decoder will reference the dummy 1-byte frame in
its ref_list. When the end_frame() call fails, the dummy frame is
unref'd but still referenced in the h264 decoder. This eventually causes
a NULL dereference and segmentation fault.

This patch fixes the issue by properly discarding references to the
dummy frame in the H264Context after the frame has been unref'd.
---
 libavcodec/h264_picture.c |  3 +++
 libavcodec/h264_refs.c| 20 ++--
 libavcodec/h264dec.h  |  1 +
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
index f634d2a..702ca12 100644
--- a/libavcodec/h264_picture.c
+++ b/libavcodec/h264_picture.c
@@ -177,6 +177,9 @@ int ff_h264_field_end(H264Context *h, H264SliceContext *sl, 
int in_setup)
 if (err < 0)
 av_log(avctx, AV_LOG_ERROR,
"hardware accelerator failed to decode picture\n");
+
+if (err < 0 && avctx->hwaccel->pix_fmt == AV_PIX_FMT_VIDEOTOOLBOX)
+ff_h264_remove_cur_pic_ref(h);
 }
 
 #if FF_API_CAP_VDPAU
diff --git a/libavcodec/h264_refs.c b/libavcodec/h264_refs.c
index 97bf588..9c77bc7 100644
--- a/libavcodec/h264_refs.c
+++ b/libavcodec/h264_refs.c
@@ -560,6 +560,23 @@ static H264Picture *remove_long(H264Context *h, int i, int 
ref_mask)
 return pic;
 }
 
+void ff_h264_remove_cur_pic_ref(H264Context *h)
+{
+int j;
+
+if (h->short_ref[0] == h->cur_pic_ptr) {
+remove_short_at_index(h, 0);
+}
+
+if (h->cur_pic_ptr->long_ref) {
+for (j = 0; j < FF_ARRAY_ELEMS(h->long_ref); j++) {
+if (h->long_ref[j] == h->cur_pic_ptr) {
+remove_long(h, j, 0);
+}
+}
+}
+}
+
 void ff_h264_remove_all_refs(H264Context *h)
 {
 int i;
@@ -571,8 +588,7 @@ void ff_h264_remove_all_refs(H264Context *h)
 
 if (h->short_ref_count && !h->last_pic_for_ec.f->data[0]) {
 ff_h264_unref_picture(h, &h->last_pic_for_ec);
-if (h->short_ref[0]->f->buf[0])
-ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
+ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
 }
 
 for (i = 0; i < h->short_ref_count; i++) {
diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
index 5f868b7..063afed 100644
--- a/libavcodec/h264dec.h
+++ b/libavcodec/h264dec.h
@@ -569,6 +569,7 @@ int ff_h264_alloc_tables(H264Context *h);
 int ff_h264_decode_ref_pic_list_reordering(H264SliceContext *sl, void *logctx);
 int ff_h264_build_ref_list(H264Context *h, H264SliceContext *sl);
 void ff_h264_remove_all_refs(H264Context *h);
+void ff_h264_remove_cur_pic_ref(H264Context *h);
 
 /**
  * Execute the reference picture marking (memory management control 
operations).
-- 
2.10.1 (Apple Git-78)
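
The root cause described in the commit message (a frame is released while a reference list still points at it) can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the patch's logic: `Picture`, `Decoder`, `remove_cur_pic_ref` and `fail_end_frame` are hypothetical, only a short-term list is modelled, and the whole list is scanned, whereas the patch checks `short_ref[0]` and the long-term list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 4

typedef struct Picture {
    int has_buffer; /* 1 while the frame data is valid */
} Picture;

typedef struct Decoder {
    Picture *short_ref[MAX_REFS];
    int      short_ref_count;
    Picture *cur; /* picture currently being decoded */
} Decoder;

/* Drop every list entry that points at the current picture and compact
 * the list; returns the number of entries removed. */
int remove_cur_pic_ref(Decoder *d)
{
    int kept = 0, removed;

    assert(d && d->short_ref_count <= MAX_REFS);
    for (int i = 0; i < d->short_ref_count; i++) {
        if (d->short_ref[i] != d->cur)
            d->short_ref[kept++] = d->short_ref[i];
    }
    removed = d->short_ref_count - kept;
    for (int i = kept; i < d->short_ref_count; i++)
        d->short_ref[i] = NULL;
    d->short_ref_count = kept;
    return removed;
}

/* Models a failed end_frame(): the dummy buffer is released, then stale
 * list entries are purged so nothing dereferences it later. */
void fail_end_frame(Decoder *d)
{
    d->cur->has_buffer = 0; /* stands in for unref'ing the frame */
    remove_cur_pic_ref(d);
}
```

If the purge step is skipped, the list keeps a pointer to a picture whose buffer is gone, which is the situation the commit message associates with the later crash.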



Re: [FFmpeg-devel] [PATCH] avformat/http: Check for truncated buffers in http_connect()

2017-02-13 Thread Steven Liu
2017-02-13 20:46 GMT+08:00 Michael Niedermayer :

> Reported-by: SleepProgger 
> Signed-off-by: Michael Niedermayer 
> ---
>  libavformat/http.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/http.c b/libavformat/http.c
> index 944a6cf322..bd1be3f7bb 100644
> --- a/libavformat/http.c
> +++ b/libavformat/http.c
> @@ -1011,6 +1011,7 @@ static int http_connect(URLContext *h, const char
> *path, const char *local_path,
>  int len = 0;
>  const char *method;
>  int send_expect_100 = 0;
> +int ret;
>
>  /* send http header */
>  post = h->flags & AVIO_FLAG_WRITE;
> @@ -1107,7 +1108,7 @@ static int http_connect(URLContext *h, const char
> *path, const char *local_path,
>  if (s->headers)
>  av_strlcpy(headers + len, s->headers, sizeof(headers) - len);
>
> -snprintf(s->buffer, sizeof(s->buffer),
> +ret = snprintf(s->buffer, sizeof(s->buffer),
>   "%s %s HTTP/1.1\r\n"
>   "%s"
>   "%s"
> @@ -1123,6 +1124,14 @@ static int http_connect(URLContext *h, const char
> *path, const char *local_path,
>
>  av_log(h, AV_LOG_DEBUG, "request: %s\n", s->buffer);
>
> +if (strlen(headers) + 1 == sizeof(headers) ||
> +ret >= sizeof(s->buffer)) {
> +av_log(h, AV_LOG_ERROR, "overlong headers\n");
> +err = AVERROR(EINVAL);
> +goto done;
> +}
> +
> +
>  if ((err = ffurl_write(s->hd, s->buffer, strlen(s->buffer))) < 0)
>  goto done;
>
> --
> 2.11.0
>



LGTM.

By the way, what about initializing the 'ret' variable to 0 where it is defined?


Thanks
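
The truncation check in the patch relies on the return value of `snprintf`, which is the length the full output would have had. A minimal standalone sketch of the same idea follows; `build_request` is a hypothetical helper, not a function from libavformat/http.c, and it formats only a request line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats an HTTP request line into buf.
 * Returns 0 on success, -1 if the output was truncated or formatting failed. */
int build_request(char *buf, size_t size, const char *method, const char *path)
{
    int ret;

    assert(buf && size > 0);
    ret = snprintf(buf, size, "%s %s HTTP/1.1\r\n", method, path);
    /* snprintf returns the number of characters it wanted to write,
     * excluding the terminating NUL; a value >= size means truncation. */
    if (ret < 0 || (size_t)ret >= size)
        return -1;
    return 0;
}
```

Calling it with a 64-byte buffer and the path `/index.html` succeeds, while a 16-byte buffer makes it return -1 instead of sending a cut-off request.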


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg_filter: initialize cuvid for filter_complex

2017-02-13 Thread Miroslav Slugeň

On 13.2.2017 at 23:00, Timo Rothenpieler wrote:

>> It is a problem in NVENC.
>>
>> You create the first frame in CUVID before NVENC is initialized, so this
>> first frame is not accessible to NVENC until
>> dl_fn->cuda_dl->cuCtxPushCurrent(ctx->cu_context) is called in NVENC.
>>
>> This trivial patch should fix your problem.
>>
>> M.
>
> Very interesting. I don't think this patch is the proper fix though.
> There never should be an active cuda context when returning from a
> function, at least that's the premise under which I wrote all cuda
> related functions so far.
>
> This must mean that before, cuvid or something else must somehow have
> leaked a bound cuda context to nvenc. So that might need fixing as well.
>
> Thank you very much for finding this!

Try looking at:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#module


Maybe someone from NVIDIA will be able to explain it in more detail.

M.




Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg_filter: initialize cuvid for filter_complex

2017-02-13 Thread Miroslav Slugeň

On 13.2.2017 at 11:18, Timo Rothenpieler wrote:

That's what it looks like for me:
https://bpaste.net/show/890855410dac

Happens on two independent machines, on both Windows using MSVC and
Linux with gcc.
Both machines are definitely nowhere near out of memory, on either
system or device memory.

I can't reproduce it on my two systems with the same sample and the same
command line.

First 1000 lines:
375.26: https://bpaste.net/show/bed97b3e0287
378.09: https://bpaste.net/show/912c042036cd

Configuration1:
Debian Jessie Linux desktop 4.8.0-0.bpo.2-amd64 #1 SMP Debian
4.8.15-2~bpo8+2 (2017-01-17) x86_64 GNU/Linux
GeForce GTX 1060, drivers 375.26

Configuration2:
Debian Jessie Linux pascal 4.7.0-0.bpo.1-amd64 #1 SMP Debian
4.7.8-1~bpo8+1 (2016-10-19) x86_64 GNU/Linux
GeForce GTX 1080, drivers 378.09

That's not built from the right branch.
Most notably: On the filter-merge branch, the cuvid pfnSequenceCallback
happens before the "Nvenc initialized successfully", on your log Nvenc
still gets initialized first.

It is a problem in NVENC.

You create the first frame in CUVID before NVENC is initialized, so this
first frame is not accessible to NVENC until
dl_fn->cuda_dl->cuCtxPushCurrent(ctx->cu_context) is called in NVENC.


This trivial patch should fix your problem.

M.
diff -Nurp a/libavcodec/nvenc.c b/libavcodec/nvenc.c
--- a/libavcodec/nvenc.c	2017-02-13 22:22:37.627309692 +0100
+++ b/libavcodec/nvenc.c	2017-02-13 22:16:09.0 +0100
@@ -426,6 +426,8 @@ static av_cold int nvenc_setup_device(AV
 av_log(avctx, AV_LOG_FATAL, "Provided device doesn't support required NVENC features\n");
 return ret;
 }
+
+dl_fn->cuda_dl->cuCtxPushCurrent(ctx->cu_context);
 } else {
 int i, nb_devices = 0;
 


Re: [FFmpeg-devel] [PATCH v2 0/5] A native Opus encoder

2017-02-13 Thread Rostislav Pehlivanov
On 11 February 2017 at 00:25, Rostislav Pehlivanov 
wrote:

> This series of commits implements a native Opus encoder.
>
> This is v2 of the patchset, the changes are:
> - The forward MDCT doesn't need a third buffer and can handle
>   the full set of sizes the init function allows
> - The encoder has a new faster than libopus and more accurate
>   PVQ search algorithm and can now handle transients properly.
>
> Rostislav Pehlivanov (5):
>   opus_rc: add entropy encoding functions
>   imdct15: rename to mdct15 and add a forward transform
>   opus_celt: move quantization and band decoding to opus_pvq.c
>   opus_celt: rename structures to better names and reorganize them
>   opus: add a native Opus encoder
>
>  configure                    |    7 +-
>  libavcodec/Makefile          |    5 +-
>  libavcodec/aac.h             |    4 +-
>  libavcodec/aacdec.c          |    2 +-
>  libavcodec/aacdec_template.c |    4 +-
>  libavcodec/allcodecs.c       |    2 +-
>  libavcodec/mdct15.c          |  335 +
>  libavcodec/mdct15.h          |   70 ++
>  libavcodec/opus.h            |   32 +-
>  libavcodec/opus_celt.c       | 1540 ++
>  libavcodec/opus_celt.h       |  164 +
>  libavcodec/opus_pvq.c        | 1157 +++
>  libavcodec/opus_pvq.h        |   41 ++
>  libavcodec/opus_rc.c         |  182 -
>  libavcodec/opus_rc.h         |   35 +-
>  libavcodec/opusdec.c         |    7 +-
>  libavcodec/opusenc.c         | 1128 +++
>  libavcodec/opustab.c         |    4 +
>  libavcodec/opustab.h         |    4 +
>  19 files changed, 3506 insertions(+), 1217 deletions(-)
>  create mode 100644 libavcodec/mdct15.c
>  create mode 100644 libavcodec/mdct15.h
>  create mode 100644 libavcodec/opus_celt.h
>  create mode 100644 libavcodec/opus_pvq.c
>  create mode 100644 libavcodec/opus_pvq.h
>  create mode 100644 libavcodec/opusenc.c
>
> --
> 2.11.0.483.g087da7b7c
>
>
I plan to push the patchset with all the fixes people noticed tomorrow
unless someone finds something else that's wrong.


Re: [FFmpeg-devel] [PATCH] (for discussion): nvenc: fix wrong aspect ratio for 720x576 and 720x480 resolution

2017-02-13 Thread Philip Langdale

On 2017-02-13 08:33, Miroslav Slugeň wrote:

> I am sure I know what is going on: NVENC inserts the wrong SPS VUI
> aspect_ratio_idc into H.264 packets when you encode at resolutions
> 720x576 and 720x480.
>
> AR 16:9 will insert aspect_ratio_idc=4, but it should be
> aspect_ratio_idc=255, sar_width=64, sar_height=45 for 720x576.
> AR 4:3 will insert aspect_ratio_idc=2, but it should be
> aspect_ratio_idc=255, sar_width=16, sar_height=15 for 720x576.
>
> MP4 and MKV containers contain correct AR metadata, which all players
> should accept, but the TS container has nothing like that, so it relies
> on what is inside the H.264 SPS.


Makes sense. So, I think this really needs fixing inside the driver.
Having to do the whole parse/rewrite thing is horrible - and that's
even after rewriting it not to re-implement the parser.

--phil


Re: [FFmpeg-devel] [PATCH] (for discussion): nvenc: fix wrong aspect ratio for 720x576 and 720x480 resolution

2017-02-13 Thread Miroslav Slugeň

On 13.2.2017 at 17:07, Philip Langdale wrote:

On Mon, 13 Feb 2017 07:21:51 -0800
Philip Langdale  wrote:


On Mon, 13 Feb 2017 08:52:34 +0100
Miroslav Slugeň  wrote:



I am using current STABLE drivers 375.26, because BETA drivers
378.09 caused some crashes while encoding on NVENC.

I tested this on BETA drivers too and it is still same.

Original workaround is not working anymore :(

INPUT: Stream #0:0[0x401]: Video: mpeg2video (Main) ([2][0][0][0] /
0x0002), yuv420p(tv, top first), 720x576 [SAR 64:45 DAR 16:9], 25
fps, 25 tbr, 90k tbn, 50 tbc

OUTPUT: Stream #0:0[0x100]: Video: h264 (Main) ([27][0][0][0] /
0x001B), yuv420p(progressive), 720x576 [SAR 16:11 DAR 20:11], 25
fps, 25 tbr, 90k tbn, 50 tbc

COMMAND: ffmpeg -deint adaptive -hwaccel cuvid -c:v mpeg2_cuvid -i
"in.ts" -y -c:v h264_nvenc -c:a copy -b:v 1M -preset hq -f mpegts
"out.ts"

Also someone else is complaining about this issue:
http://superuser.com/questions/1174097/ffmpegnvenc-encoding-strange-aspect-ratio

M.

Can you point me to a sample that you see this behaviour with? I
cannot reproduce with DVD sources here, which is where I saw the
original problem.

You sent me a sample and I tried it out. I was able to reproduce your
problem, but it is not the original problem, and I wonder what is
really going on.

If you take your sample and your command line and then output to a
different container (I tried mkv and mp4), it's all correct - and it
would not be correct if my original workaround was still required.
There's something specific about using mpegts that leads to this
problem.

I'm not familiar with what parts of what metadata get respected in
different contexts. For example, the out.ts that this command line
produces is reported as having the right aspect ratio by mediainfo, but
the wrong one by ffprobe (and then plays back wrong, obviously).

Modifying the darWidth and darHeight leads to changes that are visible
in mediainfo, and are cumulative with respect to whatever bug is
hitting you here.

So, theory - there was a bug where nvenc was distorting the DAR and
that bug is now fixed. It seems like there is now some other bug, or
perhaps there always was a bug, which is modifying the SAR. Perhaps it
is doing it all the time but in other containers, container level
metadata is overriding it so it never becomes an issue.

Sounds like a great time for an nvidia dev to chime in :-)

--phil

I am sure I know what is going on: NVENC inserts the wrong SPS VUI
aspect_ratio_idc into H.264 packets when you encode at resolutions
720x576 and 720x480.


AR 16:9 will insert aspect_ratio_idc=4, but it should be
aspect_ratio_idc=255, sar_width=64, sar_height=45 for 720x576.
AR 4:3 will insert aspect_ratio_idc=2, but it should be
aspect_ratio_idc=255, sar_width=16, sar_height=15 for 720x576.


MP4 and MKV containers contain correct AR metadata, which all players
should accept, but the TS container has nothing like that, so it relies
on what is inside the H.264 SPS.
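
The expected sar_width and sar_height values follow from SAR = DAR x height / width. In the H.264 VUI table, aspect_ratio_idc=4 denotes 16:11 and aspect_ratio_idc=2 denotes 12:11, which matches the "SAR 16:11 DAR 20:11" that ffprobe printed for the output stream. A small sketch of the arithmetic, using hypothetical helper names (`Ratio`, `sar_from_dar`, `dar_from_sar`) that are not taken from FFmpeg or the NVENC SDK:

```c
#include <assert.h>

typedef struct Ratio {
    int num, den;
} Ratio;

static int gcd_int(int a, int b)
{
    while (b) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

static Ratio reduce(int num, int den)
{
    Ratio r;
    int g = gcd_int(num, den);

    r.num = num / g;
    r.den = den / g;
    return r;
}

/* SAR = DAR * height / width, reduced to lowest terms. */
Ratio sar_from_dar(int dar_num, int dar_den, int width, int height)
{
    assert(dar_num > 0 && dar_den > 0 && width > 0 && height > 0);
    return reduce(dar_num * height, dar_den * width);
}

/* DAR = SAR * width / height, reduced to lowest terms. */
Ratio dar_from_sar(int sar_num, int sar_den, int width, int height)
{
    assert(sar_num > 0 && sar_den > 0 && width > 0 && height > 0);
    return reduce(sar_num * width, sar_den * height);
}
```

For 720x576, `sar_from_dar(16, 9, 720, 576)` gives 64:45 and `sar_from_dar(4, 3, 720, 576)` gives 16:15, the values listed above, while `dar_from_sar(16, 11, 720, 576)` gives the distorted 20:11.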


Miroslav Slugeň
+420 724 825 885



Re: [FFmpeg-devel] [PATCH] (for discussion): nvenc: fix wrong aspect ratio for 720x576 and 720x480 resolution

2017-02-13 Thread Philip Langdale
On Mon, 13 Feb 2017 07:21:51 -0800
Philip Langdale  wrote:

> On Mon, 13 Feb 2017 08:52:34 +0100
> Miroslav Slugeň  wrote:
> 
> > >
> > I am using current STABLE drivers 375.26, because BETA drivers
> > 378.09 caused some crashes while encoding on NVENC.
> > 
> > I tested this on BETA drivers too and it is still same.
> > 
> > Original workaround is not working anymore :(
> > 
> > INPUT: Stream #0:0[0x401]: Video: mpeg2video (Main) ([2][0][0][0] / 
> > 0x0002), yuv420p(tv, top first), 720x576 [SAR 64:45 DAR 16:9], 25
> > fps, 25 tbr, 90k tbn, 50 tbc
> > 
> > OUTPUT: Stream #0:0[0x100]: Video: h264 (Main) ([27][0][0][0] /
> > 0x001B), yuv420p(progressive), 720x576 [SAR 16:11 DAR 20:11], 25
> > fps, 25 tbr, 90k tbn, 50 tbc
> > 
> > COMMAND: ffmpeg -deint adaptive -hwaccel cuvid -c:v mpeg2_cuvid -i 
> > "in.ts" -y -c:v h264_nvenc -c:a copy -b:v 1M -preset hq -f mpegts
> > "out.ts"
> > 
> > Also someone else is complaining about this issue: 
> > http://superuser.com/questions/1174097/ffmpegnvenc-encoding-strange-aspect-ratio
> > 
> > M.
> > ___
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel  
> 
> Can you point me to a sample that you see this behaviour with? I
> cannot reproduce with DVD sources here, which is where I saw the
> original problem.

You sent me a sample and I tried it out. I was able to reproduce your
problem, but it is not the original problem, and I wonder what is
really going on.

If you take your sample and your command line and then output to a
different container (I tried mkv and mp4), it's all correct - and it
would not be correct if my original workaround was still required.
There's something specific about using mpegts that leads to this
problem.

I'm not familiar with what parts of what metadata get respected in
different contexts. For example, the out.ts that this command line
produces is reported as having the right aspect ratio by mediainfo, but
the wrong one by ffprobe (and then plays back wrong, obviously).

Modifying the darWidth and darHeight leads to changes that are visible
in mediainfo, and are cumulative with respect to whatever bug is
hitting you here.

So, theory - there was a bug where nvenc was distorting the DAR and
that bug is now fixed. It seems like there is now some other bug, or
perhaps there always was a bug, which is modifying the SAR. Perhaps it
is doing it all the time but in other containers, container level
metadata is overriding it so it never becomes an issue.

Sounds like a great time for an nvidia dev to chime in :-)

--phil


Re: [FFmpeg-devel] [PATCH] HTTP: improve performance by reducing forward seeks

2017-02-13 Thread Joel Cunningham
Friendly ping!  Any issues receiving this updated patch (submitted with git 
send-email)?  Anyone try it out yet?

Thanks,

Joel


> On Jan 30, 2017, at 10:00 AM, Joel Cunningham  wrote:
> 
> This commit optimizes HTTP performance by reducing forward seeks, instead
> favoring a read-ahead and discard on the current connection (referred to
> as a short seek) for seeks that are within a TCP window's worth of data.
> This improves performance because with TCP flow control, a window's worth
> of data will be in the local socket buffer already or in-flight from the
> sender once congestion control on the sender is fully utilizing the window.
> 
> Note: this approach doesn't attempt to differentiate from a newly opened
> connection which may not be fully utilizing the window due to congestion
> control vs one that is. The receiver can't get at this information, so we
> assume worst case; that full window is in use (we did advertise it after all)
> and that data could be in-flight
> 
> The previous behavior of closing the connection, then opening a new one
> with a new HTTP range value results in a massive amount of discarded
> and re-sent data when large TCP windows are used.  This has been observed
> on MacOS/iOS which starts with an initial window of 256KB and grows up to
> 1MB depending on the bandwidth-delay product.
> 
> When seeking within a window's worth of data and we close the connection,
> then open a new one within the same window's worth of data, we discard
> from the current offset till the end of the window.  Then on the new
> connection the server ends up re-sending the previous data from new
> offset till the end of old window.
> 
> Example (assumes full window utilization):
> 
> TCP window size: 64KB
> Position: 32KB
> Forward seek position: 40KB
> 
>         *                    (Next window)
> 32KB |--------------| 96KB |---------------| 160KB
>           *
>   40KB |---------------| 104KB
> 
> Re-sent amount: 96KB - 40KB = 56KB
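
The arithmetic in the example above, and the decision it motivates, can be
written down in a few lines of C. This is only a sketch: resent_bytes and
use_short_seek are hypothetical names, not functions from the patch, and
the model assumes full window utilization exactly as the example does.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes the server sends twice if we reconnect at seek_target while a
 * full window starting at pos is already buffered or in flight. */
static int64_t resent_bytes(int64_t pos, int64_t window, int64_t seek_target)
{
    int64_t window_end = pos + window;

    assert(window > 0 && seek_target >= pos);
    if (seek_target >= window_end)
        return 0;                      /* target lies beyond the old window */
    return window_end - seek_target;   /* overlap of old and new window */
}

/* Hypothetical decision rule: read ahead and discard on the current
 * connection when the forward distance fits in the threshold. */
static int use_short_seek(int64_t pos, int64_t seek_target, int64_t threshold)
{
    int64_t distance = seek_target - pos;
    return distance >= 0 && distance <= threshold;
}
```

With pos = 32KB, window = 64KB and seek_target = 40KB, resent_bytes gives
96KB - 40KB = 56KB, and use_short_seek with a 64KB threshold picks the
read-ahead path because the 8KB distance is inside the window.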
> 
> For a real-world test example, I have an MP4 file of ~25MB, of which ffplay
> only reads ~16MB while performing 177 seeks. With current ffmpeg, this results
> in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
> patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
> communication.
> 
> To support this feature, the short seek logic in avio_seek() has been
> extended to call a function to get the short seek threshold value.  This
> callback has been plumbed to the URLProtocol structure, which now has
> infrastructure in HTTP and TCP to get the underlying receiver window size
> via SO_RCVBUF.  If the underlying URL and protocol don't support returning
> a short seek threshold, the default s->short_seek_threshold is used
> 
> This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
> is slightly complicated by the fact that when TCP window auto-tuning is
> enabled, SO_RCVBUF doesn't report the real window size, but it does if
> SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
> this optimization on Windows in the latter case.
> 
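
As a rough illustration of the SO_RCVBUF lookup with a fallback described
above, here is a POSIX sketch. The names rcvbuf_size and
short_seek_threshold are hypothetical and the code is not taken from the
patch; it only shows the getsockopt() call and the "use the default when
the protocol cannot answer" rule.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Query the socket receive buffer size in bytes, or -1 on failure. */
static int rcvbuf_size(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical wrapper: prefer the socket's value, otherwise fall back
 * to a caller-supplied default threshold. */
static int short_seek_threshold(int fd, int default_threshold)
{
    int size;

    assert(default_threshold > 0);
    size = rcvbuf_size(fd);
    return size > 0 ? size : default_threshold;
}
```

An invalid descriptor makes getsockopt() fail, so short_seek_threshold(-1,
4096) simply returns 4096, which mirrors the fallback to
s->short_seek_threshold mentioned in the commit message.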
> Signed-off-by: Joel Cunningham 
> ---
> libavformat/avio.c|  7 +++
> libavformat/avio.h|  6 ++
> libavformat/aviobuf.c | 19 ++-
> libavformat/http.c|  8 
> libavformat/tcp.c | 21 +
> libavformat/url.h |  8 
> 6 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/libavformat/avio.c b/libavformat/avio.c
> index 3606eb0..62233a6 100644
> --- a/libavformat/avio.c
> +++ b/libavformat/avio.c
> @@ -645,6 +645,13 @@ int ffurl_get_multi_file_handle(URLContext *h, int 
> **handles, int *numhandles)
> return h->prot->url_get_multi_file_handle(h, handles, numhandles);
> }
> 
> +int ffurl_get_short_seek(URLContext *h)
> +{
> +if (!h->prot->url_get_short_seek)
> +return AVERROR(ENOSYS);
> +return h->prot->url_get_short_seek(h);
> +}
> +
> int ffurl_shutdown(URLContext *h, int flags)
> {
> if (!h->prot->url_shutdown)
> diff --git a/libavformat/avio.h b/libavformat/avio.h
> index e2cb4af..8040094 100644
> --- a/libavformat/avio.h
> +++ b/libavformat/avio.h
> @@ -313,6 +313,12 @@ typedef struct AVIOContext {
>  */
> enum AVIODataMarkerType current_type;
> int64_t last_time;
> +
> +/**
> + * A callback that is used instead of short_seek_threshold.
> + * This is current internal only, do not use from outside.
> + */
> +int (*short_seek_get)(void *opaque);
> } AVIOContext;
> 
> /**
> diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
> index bf7e5f8..4ade4d0 100644
> --- a/libavformat/aviobuf.c
> +++ b/libavformat/aviobuf.c
> @@ -119,6 +119,7 @@ int ffio_init_context(AVIOContext *s,
> s->ignore_boundary_point = 0;
> s->current_type  = AVIO_DATA_MARKER_UNKNOWN;
> s->last_time = AV_NOPTS_VALUE;
> +s->short_seek_get= NULL;
> 
> return 0;
> }
> @@ 

Re: [FFmpeg-devel] deduplicated [PATCH] Cinepak: speed up decoding several-fold, depending on the scenario, by supporting multiple output pixel formats.

2017-02-13 Thread u-9iep
Thanks Michael,

Your corrections are appreciated.

On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote:
> you may want to add yourself to MAINTAINERS (after talking with
> Roberto, who I believe has less interest in Cinepak than you do
> nowadays)

Sounds ok for me. Roberto, what do you think (if you read this)?

> > +/* #include "libavutil/avassert.h" */
> 
> useless commented out code

I hesitated but left it together with the corresponding commented out
assert() statement to serve as an indication of the validity assumption
we make for pal8. Will change.

> > +av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", 
> > avctx->pix_fmt);
> 
> av_get_pix_fmt_name()

Thanks.

> > @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = {
> >  .close  = cinepak_decode_end,
> >  .decode = cinepak_decode_frame,
> >  .capabilities   = AV_CODEC_CAP_DR1,
> > +.pix_fmts   = pixfmt_list,
> 
> This is possibly unneeded

ok!

Regards,
Rune



[FFmpeg-devel] [PATCH v7] - Added Turing codec interface for ffmpeg

2017-02-13 Thread Saverio Blasi
- This patch contains the changes to interface the Turing codec 
(http://turingcodec.org/) with ffmpeg. The patch was modified to address the 
comments in the review as follows:
  - Added a pkg-config file to list all dependencies required by libturing. 
This should address the issue pointed out by Hendrik Leppkes on Fri 18/11/2016
  - As per suggestions of wm4, two functions (add_option and finalise_options) 
have been created. The former appends new options while the latter sets up the 
argv array of pointers to char* accordingly. add_option re-allocates the buffer 
for options using av_realloc
  - Additionally, both these functions handle the errors in case the memory 
wasn't allocated correctly
  - malloc|free|realloc have been substituted with their corresponding 
av_{malloc|free|realloc} version
  - The check on bit depth has been removed, since ffmpeg already casts to the 
right pix_fmt and bit depth
  - pix_fmts is now set in ff_libturing_encoder as in h264dec.c.
  - Changed usage of av_free with av_freep and fixed calls to free arrays
  - Added brackets to all if and for statements
  - Avoid repetition of code to free arrays in case of failure to initialise 
the libturing encoder
  - Some fixes to address the review from wm4 and Mark Thompson received on Wed 
08/02/2017
  - Fixed indentation
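
The add_option / finalise_options split described in the list above can be
pictured roughly as follows. This is a sketch under assumptions, not the
code from libturing.c: the OptionList struct and its fields are
hypothetical, plain realloc/malloc stand in for av_realloc/av_malloc, and
the option strings are packed back to back in a single buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: options stored as "opt1\0opt2\0..." */
typedef struct OptionList {
    char  *buf;    /* packed option strings */
    size_t size;   /* bytes used in buf */
    int    count;  /* number of options stored */
    char **argv;   /* filled in by finalise_options() */
} OptionList;

/* Append one option; returns 0 on success, -1 if allocation fails. */
static int add_option(OptionList *l, const char *opt)
{
    size_t len;
    char *tmp;

    assert(l && opt);
    len = strlen(opt) + 1;
    tmp = realloc(l->buf, l->size + len);
    if (!tmp)
        return -1;                 /* l->buf is untouched and still valid */
    memcpy(tmp + l->size, opt, len);
    l->buf   = tmp;
    l->size += len;
    l->count++;
    return 0;
}

/* Build the NULL-terminated argv once the buffer no longer moves. */
static int finalise_options(OptionList *l)
{
    char **argv;
    char *p;
    int i;

    assert(l);
    argv = malloc((size_t)(l->count + 1) * sizeof(*argv));
    if (!argv)
        return -1;
    p = l->buf;
    for (i = 0; i < l->count; i++) {
        argv[i] = p;
        p += strlen(p) + 1;        /* step to the next packed string */
    }
    argv[l->count] = NULL;
    free(l->argv);
    l->argv = argv;
    return 0;
}
```

One plausible reason for keeping the two steps apart is that realloc may
move the buffer, so pointers into it are only safe to compute after the
last option has been appended.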
---
 LICENSE.md |   1 +
 configure  |   5 +
 libavcodec/Makefile|   1 +
 libavcodec/allcodecs.c |   1 +
 libavcodec/libturing.c | 320 +
 5 files changed, 328 insertions(+)
 create mode 100644 libavcodec/libturing.c

diff --git a/LICENSE.md b/LICENSE.md
index 640633c..86e5371 100644
--- a/LICENSE.md
+++ b/LICENSE.md
@@ -85,6 +85,7 @@ The following libraries are under GPL:
 - frei0r
 - libcdio
 - librubberband
+- libturing
 - libvidstab
 - libx264
 - libx265
diff --git a/configure b/configure
index 7154142..a27cb8b 100755
--- a/configure
+++ b/configure
@@ -255,6 +255,7 @@ External library support:
   --enable-libssh          enable SFTP protocol via libssh [no]
   --enable-libtesseract    enable Tesseract, needed for ocr filter [no]
   --enable-libtheora       enable Theora encoding via libtheora [no]
+  --enable-libturing       enable H.265/HEVC encoding via libturing [no]
   --enable-libtwolame      enable MP2 encoding via libtwolame [no]
   --enable-libv4l2         enable libv4l2/v4l-utils [no]
   --enable-libvidstab      enable video stabilization using vid.stab [no]
@@ -1562,6 +1563,7 @@ EXTERNAL_LIBRARY_LIST="
 libssh
 libtesseract
 libtheora
+libturing
 libtwolame
 libv4l2
 libvidstab
@@ -2858,6 +2860,7 @@ libspeex_decoder_deps="libspeex"
 libspeex_encoder_deps="libspeex"
 libspeex_encoder_select="audio_frame_queue"
 libtheora_encoder_deps="libtheora"
+libturing_encoder_deps="libturing"
 libtwolame_encoder_deps="libtwolame"
 libvo_amrwbenc_encoder_deps="libvo_amrwbenc"
 libvorbis_decoder_deps="libvorbis"
@@ -5131,6 +5134,7 @@ die_license_disabled gpl frei0r
 die_license_disabled gpl libcdio
 die_license_disabled gpl librubberband
 die_license_disabled gpl libsmbclient
+die_license_disabled gpl libturing
 die_license_disabled gpl libvidstab
 die_license_disabled gpl libx264
 die_license_disabled gpl libx265
@@ -5789,6 +5793,7 @@ enabled libssh            && require_pkg_config libssh libssh/sftp.h sftp_init
 enabled libspeex          && require_pkg_config speex speex/speex.h speex_decoder_init -lspeex
 enabled libtesseract      && require_pkg_config tesseract tesseract/capi.h TessBaseAPICreate
 enabled libtheora         && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg
+enabled libturing         && require_pkg_config libturing turing.h turing_version
 enabled libtwolame        && require libtwolame twolame.h twolame_init -ltwolame &&
                              { check_lib twolame.h twolame_encode_buffer_float32_interleaved -ltwolame ||
                                die "ERROR: libtwolame must be installed and version must be >= 0.3.10"; }
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 43a6add..de5af1d 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -883,6 +883,7 @@ OBJS-$(CONFIG_LIBSHINE_ENCODER)   += libshine.o
 OBJS-$(CONFIG_LIBSPEEX_DECODER)   += libspeexdec.o
 OBJS-$(CONFIG_LIBSPEEX_ENCODER)   += libspeexenc.o
 OBJS-$(CONFIG_LIBTHEORA_ENCODER)  += libtheoraenc.o
+OBJS-$(CONFIG_LIBTURING_ENCODER)  += libturing.o
 OBJS-$(CONFIG_LIBTWOLAME_ENCODER) += libtwolame.o
 OBJS-$(CONFIG_LIBVO_AMRWBENC_ENCODER) += libvo-amrwbenc.o
 OBJS-$(CONFIG_LIBVORBIS_DECODER)  += libvorbisdec.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index f92b2b7..f650591 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -615,6 +615,7 @@ void avcodec_register_all(void)
 REGISTER_ENCODER(LIBSHINE,  libshine);
 REGISTER_ENCDEC (LIBSPEEX,  libspeex);
  

Re: [FFmpeg-devel] deduplicated [PATCH] Cinepak: speed up decoding several-fold, depending on the scenario, by supporting multiple output pixel formats.

2017-02-13 Thread Paul B Mahol
On 2/13/17, Michael Niedermayer  wrote:
> On Sat, Feb 11, 2017 at 10:25:03PM +0100, u-9...@aetey.se wrote:
>> Hello,
>>
>> This is my best effort attempt to make the patch acceptable
>> by the upstream's criteria.
>>
>> Daniel, do you mind that I referred to your message in the commit?
>> I believe it is best to indicate numbers from a third party measurement.
>>
>> The code seems to be equivalent to the previous patch,
>> with about 20% less LOC.
>>
>> This hurts readability (my subjective impression) but on the positive
>> side
>> the change makes the structure of the code more explicit.
>>
>> Attaching the patch.
>>
>> Now I have done what I can, have to leave.
>> Unless there are bugs there in the patch, my attempt to contribute ends
>> at this point.
>>
>> Thanks to everyone who cared to objectively discuss a specific case
>> of ffmpeg usage, the implications of techniques around VQ and whether/why
>> some non-traditional approaches can make sense.
>>
>> Good luck to the ffmpeg project, it is very useful and valuable.
>>
>> Best regards,
>> Rune
>
>>  cinepak.c |  844
>> ++
>>  1 file changed, 692 insertions(+), 152 deletions(-)
>> cc2ab45b7633651bc0ff80ca57c78ef4fc649d3c
>> 0001-Cinepak-speed-up-decoding-several-fold-depending-on-.patch
>> From 0c9badec5d144b995c0bb52c7a80939b672be3f5 Mon Sep 17 00:00:00 2001
>> From: Rl 
>> Date: Sat, 11 Feb 2017 20:28:54 +0100
>> Subject: [PATCH] Cinepak: speed up decoding several-fold, depending on
>> the
>>  scenario, by supporting multiple output pixel formats.
>>
>> Decoding to rgb24 and pal8 is optimized.
>>
>> Added rgb32, rgb565, yuv420p, each with faster decoding than to rgb24.
>>
>> The most noticeable gain is achieved by the created possibility
>> to skip format conversions, for example when decoding to rgb565
>> 
>> Using matrixbench_mpeg2.mpg (720x567) encoded with ffmpeg into Cinepak
>> using default settings, decoding on an i5 3570K, 3.4 GHz:
>>
>> bicubic (default):  ~24x realtime
>> fast_bilinear:  ~65x realtime
>> patch w/rgb565 override:~154x realtime
>> 
>> (https://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/206799.html)
>>
>> palettized input can be decoded to any of the output formats,
>> pal8 output is still limited to palettized input
>>
>> with input other than palettized/grayscale
>> yuv420 is approximated by the Cinepak colorspace
>>
>> The output format can be chosen at runtime by an option or via the API.
>> ---
>>  libavcodec/cinepak.c | 844
>> +--
>>  1 file changed, 692 insertions(+), 152 deletions(-)
>
> you may want to add yourself to MAINTAINERS (after talking with
> Roberto, who I believe has less interest in Cinepak than you do
> nowadays)
>
>
>>
>> diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c
>> index d657e9c0c1..7b08e20e06 100644
>> --- a/libavcodec/cinepak.c
>> +++ b/libavcodec/cinepak.c
>> @@ -31,6 +31,8 @@
>>   *
>>   * Cinepak colorspace support (c) 2013 Rl, Aetey Global Technologies AB
>>   * @author Cinepak colorspace, Rl, Aetey Global Technologies AB
>> + * Extra output formats / optimizations (c) 2017 Rl, Aetey Global
>> Technologies AB
>> + * @author Extra output formats / optimizations, Rl, Aetey Global
>> Technologies AB
>>   */
>>
>>  #include 
>> @@ -39,23 +41,48 @@
>>
>>  #include "libavutil/common.h"
>>  #include "libavutil/intreadwrite.h"
>> +#include "libavutil/opt.h"
>
>> +/* #include "libavutil/avassert.h" */
>
> useless commented out code
>
> [...]
>> +switch (avctx->pix_fmt) {
>> +case AV_PIX_FMT_RGB32:
>> +s->decode_codebook = cinepak_decode_codebook_rgb32;
>> +s->decode_vectors  = cinepak_decode_vectors_rgb32;
>> +break;
>> +case AV_PIX_FMT_RGB24:
>> +s->decode_codebook = cinepak_decode_codebook_rgb24;
>> +s->decode_vectors  = cinepak_decode_vectors_rgb24;
>> +break;
>> +case AV_PIX_FMT_RGB565:
>> +s->decode_codebook = cinepak_decode_codebook_rgb565;
>> +s->decode_vectors  = cinepak_decode_vectors_rgb565;
>> +break;
>> +case AV_PIX_FMT_YUV420P:
>> +s->decode_codebook = cinepak_decode_codebook_yuv420;
>> +s->decode_vectors  = cinepak_decode_vectors_yuv420;
>> +break;
>> +case AV_PIX_FMT_PAL8:
>> +if (!s->palette_video) {
>> +av_log(avctx, AV_LOG_ERROR, "Palettized output not supported
>> without palettized input\n");
>> +return AVERROR(EINVAL);
>> +}
>> +s->decode_codebook = cinepak_decode_codebook_pal8;
>> +s->decode_vectors  = cinepak_decode_vectors_pal8;
>> +break;
>> +default:
>
>> +av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n",
>> avctx->pix_fmt);
>
> av_get_pix_fmt_name()
>
>
> [...]
>
>> @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = {
>>  .close  = cinepak_decode_end,

Re: [FFmpeg-devel] deduplicated [PATCH] Cinepak: speed up decoding several-fold, depending on the scenario, by supporting multiple output pixel formats.

2017-02-13 Thread Michael Niedermayer
On Sat, Feb 11, 2017 at 10:25:03PM +0100, u-9...@aetey.se wrote:
> Hello,
> 
> This is my best effort attempt to make the patch acceptable
> by the upstream's criteria.
> 
> Daniel, do you mind that I referred to your message in the commit?
> I believe it is best to indicate numbers from a third party measurement.
> 
> The code seems to be equivalent to the previous patch,
> with about 20% less LOC.
> 
> This hurts readability (my subjective impression) but on the positive side
> the change makes the structure of the code more explicit.
> 
> Attaching the patch.
> 
> Now I have done what I can, have to leave.
> Unless there are bugs there in the patch, my attempt to contribute ends
> at this point.
> 
> Thanks to everyone who cared to objectively discuss a specific case
> of ffmpeg usage, the implications of techniques around VQ and whether/why
> some non-traditional approaches can make sense.
> 
> Good luck to the ffmpeg project, it is very useful and valuable.
> 
> Best regards,
> Rune

>  cinepak.c |  844 
> ++
>  1 file changed, 692 insertions(+), 152 deletions(-)
> cc2ab45b7633651bc0ff80ca57c78ef4fc649d3c  
> 0001-Cinepak-speed-up-decoding-several-fold-depending-on-.patch
> From 0c9badec5d144b995c0bb52c7a80939b672be3f5 Mon Sep 17 00:00:00 2001
> From: Rl 
> Date: Sat, 11 Feb 2017 20:28:54 +0100
> Subject: [PATCH] Cinepak: speed up decoding several-fold, depending on the
>  scenario, by supporting multiple output pixel formats.
> 
> Decoding to rgb24 and pal8 is optimized.
> 
> Added rgb32, rgb565, yuv420p, each with faster decoding than to rgb24.
> 
> The most noticeable gain is achieved by the created possibility
> to skip format conversions, for example when decoding to rgb565
> 
> Using matrixbench_mpeg2.mpg (720x567) encoded with ffmpeg into Cinepak
> using default settings, decoding on an i5 3570K, 3.4 GHz:
> 
> bicubic (default):  ~24x realtime
> fast_bilinear:  ~65x realtime
> patch w/rgb565 override:~154x realtime
> 
> (https://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/206799.html)
> 
> palettized input can be decoded to any of the output formats,
> pal8 output is still limited to palettized input
> 
> with input other than palettized/grayscale
> yuv420 is approximated by the Cinepak colorspace
> 
> The output format can be chosen at runtime by an option or via the API.
> ---
>  libavcodec/cinepak.c | 844 
> +--
>  1 file changed, 692 insertions(+), 152 deletions(-)

You may want to add yourself to MAINTAINERS (after talking with
Roberto, who I believe has less interest in Cinepak than you do
nowadays)


> 
> diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c
> index d657e9c0c1..7b08e20e06 100644
> --- a/libavcodec/cinepak.c
> +++ b/libavcodec/cinepak.c
> @@ -31,6 +31,8 @@
>   *
>   * Cinepak colorspace support (c) 2013 Rl, Aetey Global Technologies AB
>   * @author Cinepak colorspace, Rl, Aetey Global Technologies AB
> + * Extra output formats / optimizations (c) 2017 Rl, Aetey Global 
> Technologies AB
> + * @author Extra output formats / optimizations, Rl, Aetey Global 
> Technologies AB
>   */
>  
>  #include 
> @@ -39,23 +41,48 @@
>  
>  #include "libavutil/common.h"
>  #include "libavutil/intreadwrite.h"
> +#include "libavutil/opt.h"

> +/* #include "libavutil/avassert.h" */

useless commented out code

[...]
> +switch (avctx->pix_fmt) {
> +case AV_PIX_FMT_RGB32:
> +s->decode_codebook = cinepak_decode_codebook_rgb32;
> +s->decode_vectors  = cinepak_decode_vectors_rgb32;
> +break;
> +case AV_PIX_FMT_RGB24:
> +s->decode_codebook = cinepak_decode_codebook_rgb24;
> +s->decode_vectors  = cinepak_decode_vectors_rgb24;
> +break;
> +case AV_PIX_FMT_RGB565:
> +s->decode_codebook = cinepak_decode_codebook_rgb565;
> +s->decode_vectors  = cinepak_decode_vectors_rgb565;
> +break;
> +case AV_PIX_FMT_YUV420P:
> +s->decode_codebook = cinepak_decode_codebook_yuv420;
> +s->decode_vectors  = cinepak_decode_vectors_yuv420;
> +break;
> +case AV_PIX_FMT_PAL8:
> +if (!s->palette_video) {
> +av_log(avctx, AV_LOG_ERROR, "Palettized output not supported 
> without palettized input\n");
> +return AVERROR(EINVAL);
> +}
> +s->decode_codebook = cinepak_decode_codebook_pal8;
> +s->decode_vectors  = cinepak_decode_vectors_pal8;
> +break;
> +default:

> +av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", 
> avctx->pix_fmt);

av_get_pix_fmt_name()
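
The point of this remark is that a symbolic name in the log is easier to
act on than a raw enum value. In FFmpeg the lookup is
av_get_pix_fmt_name() from libavutil/pixdesc.h, which returns NULL for
unknown formats. The sketch below is self-contained and therefore uses
hypothetical stand-ins (the enum, pix_fmt_name and format_error are
invented for illustration); it only shows the lookup plus NULL handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a few AVPixelFormat values. */
enum { FMT_RGB24, FMT_RGB565, FMT_PAL8, FMT_NB };

/* Stand-in for av_get_pix_fmt_name(): name, or NULL if unknown. */
static const char *pix_fmt_name(int fmt)
{
    static const char *const names[FMT_NB] = { "rgb24", "rgb565", "pal8" };
    return (fmt >= 0 && fmt < FMT_NB) ? names[fmt] : NULL;
}

/* Format the error text with a name instead of a number.
 * Returns the message length, as reported by snprintf(). */
static int format_error(char *buf, size_t size, int fmt)
{
    const char *name = pix_fmt_name(fmt);

    assert(buf && size > 0);
    return snprintf(buf, size, "Unsupported pixel format %s",
                    name ? name : "unknown");
}
```

In the decoder itself the same idea would amount to passing the looked-up
name to av_log() with a %s conversion in place of %d.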


[...]

> @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = {
>  .close  = cinepak_decode_end,
>  .decode = cinepak_decode_frame,
>  .capabilities   = AV_CODEC_CAP_DR1,
> +.pix_fmts   = pixfmt_list,

This is possibly unneeded

thx

[...]

-- 
Michael GnuPG 

Re: [FFmpeg-devel] [PATCH] (for discussion): cuvid: allow to crop and resize in decoder

2017-02-13 Thread Michael Niedermayer
On Mon, Feb 13, 2017 at 12:43:51PM +0100, Hendrik Leppkes wrote:
> On Mon, Feb 13, 2017 at 11:36 AM, Timo Rothenpieler
>  wrote:
> > Am 12.02.2017 um 20:59 schrieb Hendrik Leppkes:
> >> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň  
> >> wrote:
> >>> This patch is for discussion only, not ready to commit yet.
> >>>
> >>> 1. The cuvid decoder actually supports scaling input to the requested
> >>> resolution without any performance penalty (like libnpp does), so this
> >>> patch is a proof of concept that it works as expected.
> >>>
> >>
> >> I don't think scaling is something a decoder should be doing, we don't
> >> really want all sorts of video processing jumbled up into one
> >> monolithic cuvid thing, but rather keep tasks separated.
> >
> > I'm generally in favor of adding this, but I don't see why ffmpeg.c
> > needs changes for this.
> > The decoder should already be free to return any video size it likes.
> >
> > CUVID is kind of a huge special case with its deinterlacing already,
> > cropping/resizing the output is quite trivial compared to that.
> >
> 
> We recently just had all sorts of discussions what decoders should and
> should not do, I don't think scaling in a decoder is a good thing to
> start doing here.

Scaling in some decoders is mandated by some specs.
Some standards support reduced resolution, which can switch from frame
to frame without the decoder output changing.
There is also the possibility of scalability, where the reference stream
has lower resolution IIRC.

This is kind of different of course, but scaling code in decoders is
part of some specifications.

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu




Re: [FFmpeg-devel] [PATCH] hlsenc: intialize only on ref_pkt and drop all packets

2017-02-13 Thread Michael Niedermayer
On Mon, Feb 13, 2017 at 08:56:19AM +0100, Miroslav Slugeň wrote:
> Dne 12.2.2017 v 23:35 Michael Niedermayer napsal(a):
> >On Sun, Feb 12, 2017 at 07:31:43PM +0100, Miroslav Slugeň wrote:
> >>This patch fixes cutting HLS segments to exactly the same length,
> >>because it initializes only on the first ref_packet, which is a video
> >>frame, not an audio frame (old behavior).
> >>
> >>It also drops all packets without PTS, all packets that arrive before
> >>initialization, and all packets before the PTS of the first
> >>initialization packet.
> >>
> >>Now it should be possible to create segments of exactly the same length
> >>if we use the new -force_key_frames hls:time_in_seconds parameter.
> >>
> >>This is required to support adaptive HLS.
> >>
> >>-- 
> >>Miroslav Slugeň
> >>
> >>
> >>  hlsenc.c |   24 
> >>  1 file changed, 20 insertions(+), 4 deletions(-)
> >>7f784939c938c7697be2178647828a36815fc731  
> >>0001-hlsenc-intialize-only-on-ref_pkt-and-drop-all-packet.patch
> >> From a59a7dbe6fdcab64c1402adb8f11cc31400f4516 Mon Sep 17 00:00:00 2001
> >>From: Miroslav Slugen 
> >>Date: Sun, 12 Feb 2017 19:25:54 +0100
> >>Subject: [PATCH 1/1] hlsenc: intialize only on ref_pkt and drop all packets
> >>  before initialized
> >>
> >>---
> >>  libavformat/hlsenc.c | 24 
> >>  1 file changed, 20 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
> >>index ad5205a..226dd89 100644
> >>--- a/libavformat/hlsenc.c
> >>+++ b/libavformat/hlsenc.c
> >>@@ -1278,10 +1278,6 @@ static int hls_write_packet(AVFormatContext *s, 
> >>AVPacket *pkt)
> >>  oc = hls->avf;
> >>  stream_index = pkt->stream_index;
> >>  }
> >>-if (hls->start_pts == AV_NOPTS_VALUE) {
> >>-hls->start_pts = pkt->pts;
> >>-hls->end_pts   = pkt->pts;
> >>-}
> >>  if (hls->has_video) {
> >>  can_split = st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO &&
> >>@@ -1292,6 +1288,11 @@ static int hls_write_packet(AVFormatContext *s, 
> >>AVPacket *pkt)
> >>  is_ref_pkt = can_split = 0;
> >>  if (is_ref_pkt) {
> >>+if (hls->start_pts == AV_NOPTS_VALUE) {
> >>+hls->start_pts = pkt->pts;
> >>+hls->end_pts   = pkt->pts;
> >>+}
> >>+
> >>  if (hls->new_start) {
> >>  hls->new_start = 0;
> >>  hls->duration = (double)(pkt->pts - hls->end_pts)
> >>@@ -1371,6 +1372,21 @@ static int hls_write_packet(AVFormatContext *s, 
> >>AVPacket *pkt)
> >>  }
> >>  }
> >>+if (pkt->pts == AV_NOPTS_VALUE) {
> >>+av_log(s, AV_LOG_WARNING, "packet has no PTS, dropping packet from 
> >>stream: %d\n", pkt->stream_index);
> >>+return 0;
> >>+}
> >>+
> >>+if (hls->start_pts == AV_NOPTS_VALUE) {
> >>+av_log(s, AV_LOG_WARNING, "stream not initialized yet, dropping 
> >>packet from stream: %d\n", pkt->stream_index);
> >>+return 0;
> >>+}
> >>+
> >>+if (pkt->pts + pkt->duration <= hls->start_pts) {
> >>+av_log(s, AV_LOG_WARNING, "packet has PTS < START PTS (%"PRId64" < 
> >>%"PRId64"), dropping packet from stream: %d\n", pkt->pts, hls->start_pts, 
> >>pkt->stream_index);
> >>+return 0;
> >>+}
> >This triggers for subtitle streams, for example:
> >
> >./ffmpeg -i matrixbench_mpeg2.mpg -i 
> >fate-suite/sub/MovText_capability_tester.mp4  -f hls  -hls_segment_filename  
> >/tmp/file.%d.ts -t 10   /tmp/file.m3u8
> >
> >
> >[...]
> >
> >
> >
> >___
> >ffmpeg-devel mailing list
> >ffmpeg-devel@ffmpeg.org
> >http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

> Otherwise the patch is OK? Should I just check that the stream is
> subtitles and add an exception for it?

I did not review the patch, just tested it and found the issue listed
above.

I don't know in what other cases these conditions trigger, but
discarding packets in a muxer looks odd (which is why I tested it a
bit)
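
To make it easier to reason about when the new conditions trigger, the
three checks from the quoted hunk can be modelled on their own. This is a
sketch only: classify_packet, the enum and NOPTS are hypothetical names
(NOPTS stands in for AV_NOPTS_VALUE), while the real code works on
AVPacket fields inside hls_write_packet().

```c
#include <assert.h>
#include <stdint.h>

#define NOPTS INT64_MIN   /* stand-in for AV_NOPTS_VALUE */

enum drop_reason {
    KEEP,
    DROP_NO_PTS,           /* packet carries no PTS */
    DROP_NOT_INITIALIZED,  /* no reference packet has set start_pts yet */
    DROP_BEFORE_START      /* packet ends at or before start_pts */
};

/* Same order of checks as in the patch. */
static enum drop_reason classify_packet(int64_t pts, int64_t duration,
                                        int64_t start_pts)
{
    assert(duration >= 0);
    if (pts == NOPTS)
        return DROP_NO_PTS;
    if (start_pts == NOPTS)
        return DROP_NOT_INITIALIZED;
    if (pts + duration <= start_pts)
        return DROP_BEFORE_START;
    return KEEP;
}
```

In this model, while start_pts is still unset every packet with a valid
PTS is rejected by the second check regardless of its stream type, so any
stream whose packets arrive before the first reference packet loses them.
Whether that is the path the subtitle test above takes is not confirmed
here.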

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.




[FFmpeg-devel] [PATCH 3/4] x86util: import MOVHL macro

2017-02-13 Thread James Darnley
Originally committed to x264 in 1637239a by Henrik Gramner who has
agreed to re-license it as LGPL.  Original commit message follows.

x86: Avoid some bypass delays and false dependencies

A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning
between int and float domains, so try to avoid that if possible.
---
 libavutil/x86/x86util.asm | 12 
 1 file changed, 12 insertions(+)

diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index c063436e0a..1408f0a176 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -876,3 +876,15 @@
 psrlq   %1, 8*(%2)
 %endif
 %endmacro
+
+%macro MOVHL 2 ; dst, src
+%ifidn %1, %2
+    punpckhqdq %1, %2
+%elif cpuflag(avx)
+    punpckhqdq %1, %2, %2
+%elif cpuflag(sse4)
+    pshufd     %1, %2, q3232 ; pshufd is slow on some older CPUs, so only use it on more modern ones
+%else
+    movhlps    %1, %2        ; may cause an int/float domain transition and has a dependency on dst
+%endif
+%endmacro
-- 
2.11.1
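
For readers less familiar with these instructions, the following scalar C
model shows why the branches of MOVHL are interchangeable as far as the low
half of the destination is concerned. It is an illustration only: xmm_t
and the function names are hypothetical, a 128-bit register is modelled as
two 64-bit halves, and nothing here is SIMD code.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit register modelled as low and high 64-bit halves. */
typedef struct { uint64_t lo, hi; } xmm_t;

/* punpckhqdq: result = { a.hi, b.hi }. In the two-operand SSE form
 * a is also the destination; the AVX form names it separately. */
static xmm_t punpckhqdq(xmm_t a, xmm_t b)
{
    xmm_t r = { a.hi, b.hi };
    return r;
}

/* pshufd with q3232 selects dwords 2,3,2,3: both halves = src.hi. */
static xmm_t pshufd_q3232(xmm_t src)
{
    xmm_t r = { src.hi, src.hi };
    return r;
}

/* movhlps: low half from src.hi, high half of dst is preserved,
 * hence the dependency on dst noted in the macro comment. */
static xmm_t movhlps(xmm_t dst, xmm_t src)
{
    xmm_t r = { src.hi, dst.hi };
    return r;
}

/* Low half produced by MOVHL dst, src; all variants must agree on it. */
static uint64_t movhl_low(xmm_t dst, xmm_t src)
{
    uint64_t a = punpckhqdq(src, src).lo;
    uint64_t b = pshufd_q3232(src).lo;
    uint64_t c = movhlps(dst, src).lo;

    assert(a == b && b == c);
    return a;
}
```

The variants differ only in the upper half of the result, which callers of
MOVHL presumably do not rely on.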



[FFmpeg-devel] [PATCH 2/4] avcodec/h264: add named parameters to x86 function

2017-02-13 Thread James Darnley
---
 libavcodec/x86/h264_deblock.asm | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 435c8be56f..509a0dbe0c 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -282,18 +282,18 @@ cextern pb_3
 ;int8_t *tc0)
 ;-
 %macro DEBLOCK_LUMA 0
-cglobal deblock_v_luma_8, 5,5,10
+cglobal deblock_v_luma_8, 5,5,10, pix_, stride_, alpha_, beta_, base3_
     movd    m8, [r4] ; tc0
-    lea     r4, [r1*3]
-    dec     r2d        ; alpha-1
+    lea     r4, [stride_q*3]
+    dec     alpha_d        ; alpha-1
     neg     r4
-    dec     r3d        ; beta-1
-    add     r4, r0     ; pix-3*stride
+    dec     beta_d        ; beta-1
+    add     base3_q, pix_q     ; pix-3*stride
 
-    mova    m0, [r4+r1]   ; p1
-    mova    m1, [r4+2*r1] ; p0
-    mova    m2, [r0]      ; q0
-    mova    m3, [r0+r1]   ; q1
+    mova    m0, [base3_q + stride_q]   ; p1
+    mova    m1, [base3_q + 2*stride_q] ; p0
+    mova    m2, [pix_q]      ; q0
+    mova    m3, [pix_q + stride_q]   ; q1
     LOAD_MASK r2d, r3d
 
 punpcklbw m8, m8
@@ -303,24 +303,24 @@ cglobal deblock_v_luma_8, 5,5,10
     pandn   m9, m7
     pand    m8, m9
 
-    movdqa  m3, [r4] ; p2
+    movdqa  m3, [base3_q] ; p2
     DIFF_GT2 m1, m3, m5, m6, m7 ; |p2-p0| > beta-1
     pand    m6, m9
     psubb   m7, m8, m6
     pand    m6, m8
-    LUMA_Q1 m0, m3, [r4], [r4+r1], m6, m4
+    LUMA_Q1 m0, m3, [base3_q], [base3_q + stride_q], m6, m4
 
-    movdqa  m4, [r0+2*r1] ; q2
+    movdqa  m4, [pix_q + 2*stride_q] ; q2
     DIFF_GT2 m2, m4, m5, m6, m3 ; |q2-q0| > beta-1
     pand    m6, m9
     pand    m8, m6
     psubb   m7, m6
-    mova    m3, [r0+r1]
-    LUMA_Q1 m3, m4, [r0+2*r1], [r0+r1], m8, m6
+    mova    m3, [pix_q + stride_q]
+    LUMA_Q1 m3, m4, [pix_q + 2*stride_q], [pix_q + stride_q], m8, m6
 
     DEBLOCK_P0_Q0
-    mova    [r4+2*r1], m1
-    mova    [r0], m2
+    mova    [base3_q + 2*stride_q], m1
+    mova    [pix_q], m2
     RET
 
 ;-
-- 
2.11.1



[FFmpeg-devel] [PATCH 1/4] avcodec/x86: deduplicate PASS8ROWS macro

2017-02-13 Thread James Darnley
---
 libavcodec/x86/h264_deblock.asm   | 5 -
 libavcodec/x86/h264_deblock_10bit.asm | 5 -
 libavcodec/x86/hevc_deblock.asm   | 5 -
 libavutil/x86/x86util.asm | 5 +
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index fe0ab20266..435c8be56f 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -37,11 +37,6 @@ cextern pb_0
 cextern pb_1
 cextern pb_3
 
-; expands to [base],...,[base+7*stride]
-%define PASS8ROWS(base, base3, stride, stride3) \
-[base], [base+stride], [base+stride*2], [base3], \
-[base3+stride], [base3+stride*2], [base3+stride3], [base3+stride*4]
-
 %define PASS8ROWS(base, base3, stride, stride3, offset) \
 PASS8ROWS(base+offset, base3+offset, stride, stride3)
 
diff --git a/libavcodec/x86/h264_deblock_10bit.asm b/libavcodec/x86/h264_deblock_10bit.asm
index c2953640bb..1af3257a67 100644
--- a/libavcodec/x86/h264_deblock_10bit.asm
+++ b/libavcodec/x86/h264_deblock_10bit.asm
@@ -843,11 +843,6 @@ DEBLOCK_LUMA_INTRA
 mova [r0+2*r1], m2
 %endmacro
 
-; expands to [base],...,[base+7*stride]
-%define PASS8ROWS(base, base3, stride, stride3) \
-[base], [base+stride], [base+stride*2], [base3], \
-[base3+stride], [base3+stride*2], [base3+stride3], [base3+stride*4]
-
 ; in: 8 rows of 4 words in %4..%11
 ; out: 4 rows of 8 words in m0..m3
 %macro TRANSPOSE4x8W_LOAD 8
diff --git a/libavcodec/x86/hevc_deblock.asm b/libavcodec/x86/hevc_deblock.asm
index 48a597530b..85ee4800bb 100644
--- a/libavcodec/x86/hevc_deblock.asm
+++ b/libavcodec/x86/hevc_deblock.asm
@@ -39,11 +39,6 @@ cextern pw_m1
 SECTION .text
 INIT_XMM sse2
 
-; expands to [base],...,[base+7*stride]
-%define PASS8ROWS(base, base3, stride, stride3) \
-[base], [base+stride], [base+stride*2], [base3], \
-[base3+stride], [base3+stride*2], [base3+stride3], [base3+stride*4]
-
 ; in: 8 rows of 4 bytes in %4..%11
 ; out: 4 rows of 8 words in m0..m3
 %macro TRANSPOSE4x8B_LOAD 8
diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index 44ed750ae5..c063436e0a 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -29,6 +29,11 @@
 
 %include "libavutil/x86/x86inc.asm"
 
+; expands to [base],...,[base+7*stride]
+%define PASS8ROWS(base, base3, stride, stride3) \
+[base],   [base  + stride],   [base  + 2*stride], [base3], \
+[base3 + stride], [base3 + 2*stride], [base3 + stride3],  [base3 + stride*4]
+
 %macro SBUTTERFLY 4
 %ifidn %1, dqqq
 vperm2i128  m%4, m%2, m%3, q0301
-- 
2.11.1
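
The macro's comment says it expands to [base],...,[base+7*stride]. That only holds if the caller passes base3 = base + 3*stride and stride3 = 3*stride; this is an assumption read off the macro body, not something the patch states. A minimal C sketch of the address arithmetic (the helper name pass8rows_offsets is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets, relative to base, of the eight operands PASS8ROWS names.
 * Assumption: base3 = base + 3*stride and stride3 = 3*stride. */
static void pass8rows_offsets(ptrdiff_t stride, ptrdiff_t out[8])
{
    const ptrdiff_t base    = 0;
    const ptrdiff_t base3   = base + 3 * stride; /* assumed by the macro */
    const ptrdiff_t stride3 = 3 * stride;        /* assumed by the macro */

    assert(out != NULL);

    out[0] = base;
    out[1] = base  + stride;
    out[2] = base  + 2 * stride;
    out[3] = base3;
    out[4] = base3 + stride;
    out[5] = base3 + 2 * stride;
    out[6] = base3 + stride3;
    out[7] = base3 + stride * 4;
}
```

With those two relations every operand is row i at offset i*stride, using only scale factors (1, 2, 4) that x86 addressing modes can encode.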

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


[FFmpeg-devel] [PATCH 4/4] avcodec/h264: sse2, avx h luma mbaff deblock/loop filter

2017-02-13 Thread James Darnley
x86-64 only

Yorkfield:
- sse2: 2.16x (434 vs. 201 cycles)

Skylake:
- sse2: 3.04x (378 vs. 124 cycles)
- avx:  3.29x (378 vs. 115 cycles)
---
 libavcodec/x86/h264_deblock.asm | 119 
 libavcodec/x86/h264dsp_init.c   |  10 
 2 files changed, 129 insertions(+)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 509a0dbe0c..f47a199e8f 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -377,10 +377,129 @@ cglobal deblock_h_luma_8, 5,9,0,0x60+16*WIN64
 RET
 %endmacro
 
+; TODO: use macro arguments
+%macro TRANSPOSE_8X8B_XMM 8
+punpcklbw m0, m1
+punpcklbw m2, m3
+punpcklbw m4, m5
+punpcklbw m6, m7
+
+punpckhwd m1, m0, m2
+punpcklwd m0, m2
+
+punpckhwd m5, m4, m6
+punpcklwd m4, m6
+
+punpckhdq m2, m0, m4
+punpckldq m0, m4
+
+punpckhdq m6, m1, m5
+punpckldq m1, m5
+
+MOVHL m4, m0
+MOVHL m3, m2
+MOVHL m7, m6
+MOVHL m5, m1
+SWAP 1, 4
+%endmacro
+
+%macro TRANSPOSE_8X8B_XMM 0
+TRANSPOSE_8X8B_XMM 0, 1, 2, 3, 4, 5, 6, 7
+%endmacro
+
+%macro DEBLOCK_H_LUMA_MBAFF 0
+
+cglobal deblock_h_luma_mbaff_8, 5, 9, 10, 8*16, pix_, stride_, alpha_, beta_, tc0_
+movsxd stride_q, stride_d
+dec alpha_d
+dec beta_d
+mov r5, pix_q
+lea r6, [3*stride_q]
+add r5, r6
+
+movq m0, [pix_q - 4]
+movq m1, [pix_q + stride_q - 4]
+movq m2, [pix_q + 2*stride_q - 4]
+movq m3, [r5 - 4]
+movq m4, [r5 + stride_q - 4]
+movq m5, [r5 + 2*stride_q - 4]
+movq m6, [r5 + r6 - 4]
+movq m7, [r5 +4*stride_q - 4]
+
+TRANSPOSE_8X8B_XMM
+
+%assign i 0
+%rep 8
+movq [rsp + 16*i], m %+ i
+%assign i i+1
+%endrep
+
+; p2 = m1 [rsp + 16]
+; p1 = m2 [rsp + 32]
+; p0 = m3 [rsp + 48]
+; q0 = m4 [rsp + 64]
+; q1 = m5 [rsp + 80]
+; q2 = m6 [rsp + 96]
+
+SWAP 0, 2
+SWAP 1, 3
+SWAP 2, 4
+SWAP 3, 5
+
+LOAD_MASK alpha_d, beta_d
+movd m8, [tc0_q]
+punpcklbw m8, m8
+pcmpeqb m9, m9
+pcmpeqb m9, m8
+pandn   m9, m7
+pand    m8, m9
+
+movdqa  m3, [rsp + 16] ; p2
+DIFF_GT2 m1, m3, m5, m6, m7 ; |p2-p0| > beta-1
+pand    m6, m9
+psubb   m7, m8, m6
+pand    m6, m8
+LUMA_Q1 m0, m3, [rsp + 16], [rsp + 32], m6, m4
+
+movdqa  m4, [rsp + 96] ; q2
+DIFF_GT2 m2, m4, m5, m6, m3 ; |q2-q0| > beta-1
+pand    m6, m9
+pand    m8, m6
+psubb   m7, m6
+mova    m3, [rsp + 80]
+LUMA_Q1 m3, m4, [rsp + 96], [rsp + 80], m8, m6
+
+DEBLOCK_P0_Q0
+SWAP 1, 3
+SWAP 2, 4
+movq m0, [rsp]
+movq m1, [rsp + 16]
+movq m2, [rsp + 32]
+movq m5, [rsp + 80]
+movq m6, [rsp + 96]
+movq m7, [rsp + 112]
+
+TRANSPOSE_8X8B_XMM
+movq [pix_q - 4], m0
+movq [pix_q + stride_q - 4], m1
+movq [pix_q + 2*stride_q - 4], m2
+movq [r5 - 4], m3
+movq [r5 + stride_q - 4], m4
+movq [r5 + 2*stride_q - 4], m5
+movq [r5 + r6 - 4], m6
+movq [r5 +4*stride_q - 4], m7
+
+RET
+
+%endmacro
+
 INIT_XMM sse2
+DEBLOCK_H_LUMA_MBAFF
 DEBLOCK_LUMA
+
 %if HAVE_AVX_EXTERNAL
 INIT_XMM avx
+DEBLOCK_H_LUMA_MBAFF
 DEBLOCK_LUMA
 %endif
 
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index 7b3d17f971..10f19401ef 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -137,6 +137,9 @@ LF_IFUNC(h, chroma422_intra, depth, avx)\
 LF_FUNC(v,  chroma,  depth, avx)\
 LF_IFUNC(v, chroma_intra,depth, avx)
 
+LF_FUNC(h, luma_mbaff, 8, sse2)
+LF_FUNC(h, luma_mbaff, 8, avx)
+
 LF_FUNCS(uint8_t,   8)
 LF_FUNCS(uint16_t, 10)
 
@@ -297,6 +300,10 @@ av_cold void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth,
 c->h264_h_loop_filter_luma   = ff_deblock_h_luma_8_sse2;
 c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2;
 c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2;
+
+#if ARCH_X86_64
+c->h264_h_loop_filter_luma_mbaff = ff_deblock_h_luma_mbaff_8_sse2;
+#endif
 }
 if (EXTERNAL_SSSE3(cpu_flags)) {
 c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3;
@@ -307,6 +314,9 @@ av_cold void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth,
 c->h264_h_loop_filter_luma   = ff_deblock_h_luma_8_avx;
 c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_avx;
 c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_avx;
+#if ARCH_X86_64
+c->h264_h_loop_filter_luma_mbaff = ff_deblock_h_luma_mbaff_8_avx;
+#endif
 }
 } else if (bit_depth == 10) {
 if (EXTERNAL_MMXEXT(cpu_flags)) {
-- 
2.11.1
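
The routine above loads eight rows of eight bytes starting at pix - 4, transposes them so that each pixel column across the vertical edge (p3..q3) ends up in its own register, filters, and transposes back before storing. A scalar reference model of that transpose step, as a sketch only (the function name and signature are hypothetical, and this is not how the SIMD code is structured):

```c
#include <stddef.h>
#include <stdint.h>

/* Transpose an 8x8 block of bytes: dst[x][y] = src[y][x].
 * Strides are in bytes; src and dst must not overlap. */
static void transpose_8x8_bytes(uint8_t *dst, ptrdiff_t dst_stride,
                                const uint8_t *src, ptrdiff_t src_stride)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            dst[x * dst_stride + y] = src[y * src_stride + x];
}
```

Applying it twice returns the original block, which is why the patch can reuse the same TRANSPOSE_8X8B_XMM macro for the load and the store side.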



[FFmpeg-devel] [PATCH] avformat/http: Check for truncated buffers in http_connect()

2017-02-13 Thread Michael Niedermayer
Reported-by: SleepProgger 
Signed-off-by: Michael Niedermayer 
---
 libavformat/http.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/libavformat/http.c b/libavformat/http.c
index 944a6cf322..bd1be3f7bb 100644
--- a/libavformat/http.c
+++ b/libavformat/http.c
@@ -1011,6 +1011,7 @@ static int http_connect(URLContext *h, const char *path, const char *local_path,
 int len = 0;
 const char *method;
 int send_expect_100 = 0;
+int ret;
 
 /* send http header */
 post = h->flags & AVIO_FLAG_WRITE;
@@ -1107,7 +1108,7 @@ static int http_connect(URLContext *h, const char *path, const char *local_path,
 if (s->headers)
 av_strlcpy(headers + len, s->headers, sizeof(headers) - len);
 
-snprintf(s->buffer, sizeof(s->buffer),
+ret = snprintf(s->buffer, sizeof(s->buffer),
  "%s %s HTTP/1.1\r\n"
  "%s"
  "%s"
@@ -1123,6 +1124,14 @@ static int http_connect(URLContext *h, const char *path, const char *local_path,
 
 av_log(h, AV_LOG_DEBUG, "request: %s\n", s->buffer);
 
+if (strlen(headers) + 1 == sizeof(headers) ||
+ret >= sizeof(s->buffer)) {
+av_log(h, AV_LOG_ERROR, "overlong headers\n");
+err = AVERROR(EINVAL);
+goto done;
+}
+
+
 if ((err = ffurl_write(s->hd, s->buffer, strlen(s->buffer))) < 0)
 goto done;
 
-- 
2.11.0
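
The check in this patch relies on snprintf() returning the length the full output would have had, so a return value greater than or equal to the buffer size means the request was cut short. A minimal standalone sketch of that idiom (format_request and its parameters are hypothetical and not part of libavformat):

```c
#include <stddef.h>
#include <stdio.h>

/* Format a request line into buf.
 * Returns 0 on success, -1 if the output was truncated or snprintf failed. */
static int format_request(char *buf, size_t size,
                          const char *method, const char *path)
{
    int ret = snprintf(buf, size, "%s %s HTTP/1.1\r\n\r\n", method, path);

    /* ret excludes the terminating NUL, so ret == size is already truncated. */
    if (ret < 0 || (size_t)ret >= size)
        return -1;
    return 0;
}
```

Rejecting the request outright, as the patch does with AVERROR(EINVAL), avoids sending a header block that silently lost its tail.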



Re: [FFmpeg-devel] [PATCH 0/9] Merge lazy filter initialization in ffmpeg CLI

2017-02-13 Thread Michael Niedermayer
On Mon, Feb 13, 2017 at 10:31:19AM +0100, wm4 wrote:
> On Fri, 10 Feb 2017 15:25:13 +0100
> Michael Niedermayer  wrote:
> 
> > On Fri, Feb 10, 2017 at 03:22:28PM +0100, Michael Niedermayer wrote:
> > > On Fri, Feb 10, 2017 at 03:15:29PM +0100, Michael Niedermayer wrote:  
> > > > On Fri, Feb 10, 2017 at 01:35:32PM +0100, wm4 wrote:  
> > > > > These patches merge the previously skipped Libav commits, which made
> > > > > avconv lazily initialize libavfilter graphs. This means the filters
> > > > > are initialized with the actual output format, instead of whatever
> > > > > libavformat reports.
> > > > > 
> > > > > It's a prerequisite to making hardware decoding support saner, as
> > > > > hardware decoders will output a different pixfmt than the software
> > > > > format reported by libavformat. This can be seen on ffmpeg_qsv.c,
> > > > > which doesn't lose any functionality, even though half of the code
> > > > > is removed.
> > > > > 
> > > > > There are some differences in how ffmpeg.c and avconv.c filter-flow
> > > > > works. Also, avconv.c doesn't have sub2video. Relatively intrusive
> > > > > changes were required.
> > > > > 
> > > > > The status of cuvid is unknown, but work in progress.
> > > > > 
> > > > > Anton Khirnov (4):
> > > > >   ffmpeg: do packet ts rescaling in write_packet()
> > > > >   ffmpeg: init filtergraphs only after we have a frame on each input
> > > > >   ffmpeg: move flushing the queued frames to configure_filtergraph()
> > > > >   ffmpeg: restructure sending EOF to filters
> > > > > 
> > > > > Timo Rothenpieler (3):
> > > > >   ffmpeg_cuvid: adapt for recent filter graph initialization changes
> > > > >   avcodec/cuvid: add format mismatch debug logs
> > > > >   avcodec/cuvid: update hw_frames_ctx reference after get_format call
> > > > > 
> > > > > wm4 (2):
> > > > >   ffmpeg: make sure packets put into the muxing FIFO are refcounted
> > > > >   ffmpeg: fix printing of filter input/output names  
> > > > 
> > > > This patchset breaks
> > > > ./ffmpeg -i Voting_Machine.wmv test.avi
> > > > 
> > > > http://data.onas.ru/fun-clips/Voting_Machine.wmv
> > > > 
> > > > didnt bisect which patch causes it  
> > > 
> > > heres another example:
> > > 
> > > ./ffmpeg -i ~/tickets/4329/bogus_video.mp4 -vframes 5  -vf crop=720:404  
> > > out.mov
> > > ./ffplay out.mov
> > > before this patchset out.mov had an audio stream  
> > 
> > sample seems to be here:
> > http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket4329/
> > 
> > [...]
> > 
> > 
> 
> Most of these should be fixed, new patches:
> https://github.com/wm4/FFmpeg/commits/filter-merge

already reported on IRC:
this breaks:
./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -vf scale=80x60  small.mpg && 
./ffmpeg -i small.mpg  -vframes 3 -metadata compilation="1"  blah.m4a


Also please repost the patchset or changed patches to the ML
I think this needs more testing, its a large patchset

Thanks

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I am the wisest man alive, for I know one thing, and that is that I know
nothing. -- Socrates




Re: [FFmpeg-devel] [PATCH] (for discussion): cuvid: allow to crop and resize in decoder

2017-02-13 Thread Hendrik Leppkes
On Mon, Feb 13, 2017 at 11:36 AM, Timo Rothenpieler
 wrote:
> Am 12.02.2017 um 20:59 schrieb Hendrik Leppkes:
>> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň  wrote:
>>> This patch is for discussion only, not ready to commit yet.
>>>
>>> 1. Cuvid decoder actually supports scaling input to requested resolution
>>> without any performance penalty (like libnpp does), so this patch is proof
>>> of concept that it is working like expected.
>>>
>>
>> I don't think scaling is something a decoder should be doing, we don't
>> really want all sorts of video processing jumbled up into one
>> monolithic cuvid thing, but rather keep tasks separated.
>
> I'm generally in favor of adding this, but I don't see why ffmpeg.c
> needs changes for this.
> The decoder should already be free to return any video size it likes.
>
> CUVID is kind of a huge special case with its deinterlacing already,
> cropping/resizing the output is quite trivial compared to that.
>

We recently just had all sorts of discussions what decoders should and
should not do, I don't think scaling in a decoder is a good thing to
start doing here.

- Hendrik


Re: [FFmpeg-devel] [PATCH 3/4] aacsbr: Associate SBR data with AAC elements on init

2017-02-13 Thread Carl Eugen Hoyos
2017-02-13 6:37 GMT+01:00 Alex Converse :
> On Thu, Feb 9, 2017 at 4:11 PM, Carl Eugen Hoyos  wrote:
>>
>> 2017-02-09 18:40 GMT+01:00 Alex Converse :
>> > Quiets some log spam on pure upsampling mode.
>>
>> Please mention ticket #5163.
>
> Done

Thank you.

>> For the whole patchset, I suggest you push as soon as everybody
>> agrees on the function prefixes.
>
> Prefix changed. Patches 2 and 4 don't have any comments. Do
> they need further review by anyone?

Imo, you have waited long enough.

Thanks for the review and the patches, Carl Eugen


[FFmpeg-devel] [PATCH 1/2] lavc, lavf, lavu: remove AVOption requirement to access public fields

2017-02-13 Thread wm4
Allow all struct fields to be accessed directly, as long as they're
public.

Before this change, many fields were "public", but could be accessed via
AVOption only. This meant they were effectively not public, but were
present for documentation purposes, which was incredibly confusing at
best.
---
 libavcodec/avcodec.h   | 25 +
 libavformat/avformat.h | 50 ++
 libavutil/frame.h  | 32 +++-
 3 files changed, 34 insertions(+), 73 deletions(-)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 3e161ea10e..6a64df17e2 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -1679,7 +1679,7 @@ enum AVFieldOrder {
  * New fields can be added to the end with minor version bumps.
  * Removal, reordering and changes to existing fields require a major
  * version bump.
- * Please use AVOptions (av_opt* / av_set/get*()) to access these fields from user
+ * You can use AVOptions (av_opt* / av_set/get*()) to access these fields from user
  * applications.
  * The name string for AVOptions options matches the associated command line
  * parameter name and can be found in libavcodec/options_table.h
@@ -2950,8 +2950,8 @@ typedef struct AVCodecContext {
 #define FF_DEBUG_MMCO0x0800
 #define FF_DEBUG_BUGS0x1000
 #if FF_API_DEBUG_MV
-#define FF_DEBUG_VIS_QP  0x2000 ///< only access through AVOptions from outside libavcodec
-#define FF_DEBUG_VIS_MB_TYPE 0x4000 ///< only access through AVOptions from outside libavcodec
+#define FF_DEBUG_VIS_QP  0x2000
+#define FF_DEBUG_VIS_MB_TYPE 0x4000
 #endif
 #define FF_DEBUG_BUFFERS 0x8000
 #define FF_DEBUG_THREADS 0x0001
@@ -2961,7 +2961,6 @@ typedef struct AVCodecContext {
 #if FF_API_DEBUG_MV
 /**
  * debug
- * Code outside libavcodec should access this field using AVOptions
  * - encoding: Set by user.
  * - decoding: Set by user.
  */
@@ -3096,8 +3095,6 @@ typedef struct AVCodecContext {
  * low resolution decoding, 1-> 1/2 size, 2->1/4 size
  * - encoding: unused
  * - decoding: Set by user.
- * Code outside libavcodec should access this field using:
- * av_codec_{get,set}_lowres(avctx)
  */
  int lowres;
 #endif
@@ -3398,8 +3395,6 @@ typedef struct AVCodecContext {
 
 /**
  * Timebase in which pkt_dts/pts and AVPacket.dts/pts are.
- * Code outside libavcodec should access this field using:
- * av_codec_{get,set}_pkt_timebase(avctx)
  * - encoding unused.
  * - decoding set by user.
  */
@@ -3407,8 +3402,6 @@ typedef struct AVCodecContext {
 
 /**
  * AVCodecDescriptor
- * Code outside libavcodec should access this field using:
- * av_codec_{get,set}_codec_descriptor(avctx)
  * - encoding: unused.
  * - decoding: set by libavcodec.
  */
@@ -3419,8 +3412,6 @@ typedef struct AVCodecContext {
  * low resolution decoding, 1-> 1/2 size, 2->1/4 size
  * - encoding: unused
  * - decoding: Set by user.
- * Code outside libavcodec should access this field using:
- * av_codec_{get,set}_lowres(avctx)
  */
  int lowres;
 #endif
@@ -3461,7 +3452,6 @@ typedef struct AVCodecContext {
  * However for formats that do not use pre-multiplied alpha
  * there might be serious artefacts (though e.g. libswscale currently
  * assumes pre-multiplied alpha anyway).
- * Code outside libavcodec should access this field using AVOptions
  *
  * - decoding: set by user
  * - encoding: unused
@@ -3478,7 +3468,6 @@ typedef struct AVCodecContext {
 #if !FF_API_DEBUG_MV
 /**
  * debug motion vectors
- * Code outside libavcodec should access this field using AVOptions
  * - encoding: Set by user.
  * - decoding: Set by user.
  */
@@ -3490,7 +3479,6 @@ typedef struct AVCodecContext {
 
 /**
  * custom intra quantization matrix
- * Code outside libavcodec should access this field using av_codec_g/set_chroma_intra_matrix()
  * - encoding: Set by user, can be NULL.
  * - decoding: unused.
  */
@@ -3499,8 +3487,6 @@ typedef struct AVCodecContext {
 /**
  * dump format separator.
  * can be ", " or "\n  " or anything else
- * Code outside libavcodec should access this field using AVOptions
- * (NO direct access).
  * - encoding: Set by user.
  * - decoding: Set by user.
  */
@@ -3510,13 +3496,12 @@ typedef struct AVCodecContext {
  * ',' separated list of allowed decoders.
  * If NULL then all are allowed
  * - encoding: unused
- * - decoding: set by user through AVOPtions (NO direct access)
+ * - decoding: set by user
  */
 char *codec_whitelist;
 
 /*
  * Properties of the stream that gets decoded
- * To be accessed through av_codec_get_properties() (NO direct access)
  * - encoding: unused
  * - decoding: set by libavcodec
  */
@@ -3645,7 

[FFmpeg-devel] [PATCH 2/2] lavf: fix AVStream private fields marker

2017-02-13 Thread wm4
Public fields were added after the private fields (negating the entire
point of this). New private fields go into AVStreamInternal anyway.

The new marker was set by guessing which fields are supposed to be
private and which are not. recommended_encoder_configuration is accessed by
ffserver_config.c directly, and is supposed to use the public API.

ffmpeg.c accesses AVStream.cur_dts, even though it's a private field,
but that seems to be an older error.
---
 libavformat/avformat.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index 64180bca9e..4c1b18e002 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -1005,7 +1005,9 @@ typedef struct AVStream {
  * All fields below this line are not part of the public API. They
  * may not be used outside of libavformat and can be changed and
  * removed at will.
- * New public fields should be added right above.
+ * Internal note: be aware that physically removing these fields
+ * will break ABI. Replace removed fields with dummy fields, and
+ * add new fields to AVStreamInternal.
  *
  */
 
@@ -1201,6 +1203,12 @@ typedef struct AVStream {
  */
 int inject_global_side_data;
 
+/*
+ * All fields above this line are not part of the public API.
+ * Fields below are part of the public API and ABI again.
+ *
+ */
+
 /**
 * String containing paris of key and values describing recommended encoder configuration.
  * Paris are separated by ','.
-- 
2.11.0



Re: [FFmpeg-devel] [PATCH] (for discussion): cuvid: allow to crop and resize in decoder

2017-02-13 Thread Timo Rothenpieler
Am 12.02.2017 um 20:59 schrieb Hendrik Leppkes:
> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň  wrote:
>> This patch is for discussion only, not ready to commit yet.
>>
>> 1. Cuvid decoder actually supports scaling input to requested resolution
>> without any performance penalty (like libnpp does), so this patch is proof
>> of concept that it is working like expected.
>>
> 
> I don't think scaling is something a decoder should be doing, we don't
> really want all sorts of video processing jumbled up into one
> monolithic cuvid thing, but rather keep tasks separated.

I'm generally in favor of adding this, but I don't see why ffmpeg.c
needs changes for this.
The decoder should already be free to return any video size it likes.

CUVID is kind of a huge special case with its deinterlacing already,
cropping/resizing the output is quite trivial compared to that.



Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg_filter: initialize cuvid for filter_complex

2017-02-13 Thread Timo Rothenpieler
>> That's what it looks like for me:
>> https://bpaste.net/show/890855410dac
>>
>> Happens on two independent machines, on both Windows using MSVC and
>> Linux with gcc.
>> Both machines are definitely nowhere near out of memory, on either
>> system or device memory.
> Can't reproduce it on my two systems with same sample and same command
> line.
> 
> First 1000 lines:
> 375.26: https://bpaste.net/show/bed97b3e0287
> 378.09: https://bpaste.net/show/912c042036cd
> 
> Configuration1:
> Debian Jessie Linux desktop 4.8.0-0.bpo.2-amd64 #1 SMP Debian
> 4.8.15-2~bpo8+2 (2017-01-17) x86_64 GNU/Linux
> GeForce GTX 1060, drivers 375.26
> 
> Configuration2:
> Debian Jessie Linux pascal 4.7.0-0.bpo.1-amd64 #1 SMP Debian
> 4.7.8-1~bpo8+1 (2016-10-19) x86_64 GNU/Linux
> GeForce GTX 1080, drivers 378.09

That's not built from the right branch.
Most notably: On the filter-merge branch, the cuvid pfnSequenceCallback
happens before the "Nvenc initialized successfully", on your log Nvenc
still gets initialized first.


Re: [FFmpeg-devel] [PATCH 0/9] Merge lazy filter initialization in ffmpeg CLI

2017-02-13 Thread wm4
On Fri, 10 Feb 2017 15:25:13 +0100
Michael Niedermayer  wrote:

> On Fri, Feb 10, 2017 at 03:22:28PM +0100, Michael Niedermayer wrote:
> > On Fri, Feb 10, 2017 at 03:15:29PM +0100, Michael Niedermayer wrote:  
> > > On Fri, Feb 10, 2017 at 01:35:32PM +0100, wm4 wrote:  
> > > > These patches merge the previously skipped Libav commits, which made
> > > > avconv lazily initialize libavfilter graphs. This means the filters
> > > > are initialized with the actual output format, instead of whatever
> > > > libavformat reports.
> > > > 
> > > > It's a prerequisite to making hardware decoding support saner, as
> > > > hardware decoders will output a different pixfmt than the software
> > > > format reported by libavformat. This can be seen on ffmpeg_qsv.c,
> > > > which doesn't lose any functionality, even though half of the code
> > > > is removed.
> > > > 
> > > > There are some differences in how ffmpeg.c and avconv.c filter-flow
> > > > works. Also, avconv.c doesn't have sub2video. Relatively intrusive
> > > > changes were required.
> > > > 
> > > > The status of cuvid is unknown, but work in progress.
> > > > 
> > > > Anton Khirnov (4):
> > > >   ffmpeg: do packet ts rescaling in write_packet()
> > > >   ffmpeg: init filtergraphs only after we have a frame on each input
> > > >   ffmpeg: move flushing the queued frames to configure_filtergraph()
> > > >   ffmpeg: restructure sending EOF to filters
> > > > 
> > > > Timo Rothenpieler (3):
> > > >   ffmpeg_cuvid: adapt for recent filter graph initialization changes
> > > >   avcodec/cuvid: add format mismatch debug logs
> > > >   avcodec/cuvid: update hw_frames_ctx reference after get_format call
> > > > 
> > > > wm4 (2):
> > > >   ffmpeg: make sure packets put into the muxing FIFO are refcounted
> > > >   ffmpeg: fix printing of filter input/output names  
> > > 
> > > This patchset breaks
> > > ./ffmpeg -i Voting_Machine.wmv test.avi
> > > 
> > > http://data.onas.ru/fun-clips/Voting_Machine.wmv
> > > 
> > > didnt bisect which patch causes it  
> > 
> > heres another example:
> > 
> > ./ffmpeg -i ~/tickets/4329/bogus_video.mp4 -vframes 5  -vf crop=720:404  
> > out.mov
> > ./ffplay out.mov
> > before this patchset out.mov had an audio stream  
> 
> sample seems to be here:
> http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket4329/
> 
> [...]
> 
> 

Most of these should be fixed, new patches:
https://github.com/wm4/FFmpeg/commits/filter-merge


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg_filter: initialize cuvid for filter_complex

2017-02-13 Thread Miroslav Slugeň

Dne 12.2.2017 v 23:27 Timo Rothenpieler napsal(a):

I just tried your build with this cmd line:

ffmpeg -hwaccel cuvid -c:v h264_cuvid -i simpson_1920p_h264.mp4 -y -c:v
hevc_nvenc -an -b:v 512K -qmin 5 -qmax 50 -preset slow
out_1920p_1920p_hq.mp4

And everything works well, do you have not working example?

I have GTX 1060 3GB with current stable drivers.

M.

That's what it looks like for me:
https://bpaste.net/show/890855410dac

Happens on two independent machines, on both Windows using MSVC and
Linux with gcc.
Both machines are definitely nowhere near out of memory, on either
system or device memory.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Can't reproduce it on my two systems with same sample and same command line.

First 1000 lines:
375.26: https://bpaste.net/show/bed97b3e0287
378.09: https://bpaste.net/show/912c042036cd

Configuration1:
Debian Jessie Linux desktop 4.8.0-0.bpo.2-amd64 #1 SMP Debian 
4.8.15-2~bpo8+2 (2017-01-17) x86_64 GNU/Linux

GeForce GTX 1060, drivers 375.26

Configuration2:
Debian Jessie Linux pascal 4.7.0-0.bpo.1-amd64 #1 SMP Debian 
4.7.8-1~bpo8+1 (2016-10-19) x86_64 GNU/Linux

GeForce GTX 1080, drivers 378.09

M.


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg: prefer cuvid decoders when use option -cuvid

2017-02-13 Thread Miroslav Slugeň

Dne 13.2.2017 v 09:11 Hendrik Leppkes napsal(a):

On Mon, Feb 13, 2017 at 9:08 AM, Miroslav Slugeň  wrote:

Problem is that when you are using for example 200 input streams you have to
always specify correct input format h264/mpeg2/nvenc. Also when you are
using -hwaccel cuvid you have to specify it too, otherwise there is error:
CUVID hwaccel requested, but impossible to achieve.


You can just script that and then its not really that huge of an extra
effort to specify it. The patch is beyond terrible, and I don't see
any good way to implement something like this either without some
thorough API design to be able to select decoders in a smart way..

- Hendrik
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

I know it was 1 hour hack :)

Correct behavior should be:

1. when you use "-hwaccel cuvid" ffmpeg should try cuvid decoders first
2. alias in decoders "-c:v cuvid" and then ffmpeg should try only cuvid 
decoders


This should be same for other hwaccels.

M.


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg: prefer cuvid decoders when use option -cuvid

2017-02-13 Thread Hendrik Leppkes
On Mon, Feb 13, 2017 at 9:08 AM, Miroslav Slugeň  wrote:
> Problem is that when you are using for example 200 input streams you have to
> always specify correct input format h264/mpeg2/nvenc. Also when you are
> using -hwaccel cuvid you have to specify it too, otherwise there is error:
> CUVID hwaccel requested, but impossible to achieve.
>

You can just script that and then its not really that huge of an extra
effort to specify it. The patch is beyond terrible, and I don't see
any good way to implement something like this either without some
thorough API design to be able to select decoders in a smart way..

- Hendrik


Re: [FFmpeg-devel] [PATCH] (for discussion): cuvid: allow to crop and resize in decoder

2017-02-13 Thread wm4
On Mon, 13 Feb 2017 09:03:09 +0100
Miroslav Slugeň  wrote:

> Dne 13.2.2017 v 05:03 wm4 napsal(a):
> > On Sun, 12 Feb 2017 21:07:40 +0100
> > Miroslav Slugeň  wrote:
> >  
> >> Dne 12.2.2017 v 20:59 Hendrik Leppkes napsal(a):  
> >>> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň  
> >>> wrote:  
> >>>> This patch is for discussion only, not ready to commit yet.
> >>>>
> >>>> 1. Cuvid decoder actually supports scaling input to requested resolution
> >>>> without any performance penalty (like libnpp does), so this patch is proof
> >>>> of concept that it is working like expected.
> >>>>
> >>> I don't think scaling is something a decoder should be doing, we don't
> >>> really want all sorts of video processing jumbled up into one
> >>> monolithic cuvid thing, but rather keep tasks separated.
> >>>
> >>> - Hendrik
> >> Yes, but when you are transcoding from FHD or 4K to SD quality it could save
> >> a lot of GPU resources.
> >>
> >> We have one example where "ONE" Quadro P5000 (2xNVENC) is downscaling
> >> about 74 FHD streams to SD at realtime.
> >>
> >> I know it is not something that is acceptable in current ffmpeg, maybe
> >> libav could adopt this patch.  
> > You mean the Libav project? They'd be even less likely to accept such a
> > patch.
> >
> > Anyway, I don't think this would be slower than doing it in some sort
> > of separate cuda video filter.
> This is not true, NVDEC (cuvid) is separate chipset and has its own 
> NVDEC load in nvidia-smi monitoring tool, while resizing with libnpp is 
> completely done on CUDA cores. In NVDEC only deinterlacing ADAPTIVE is 
> using CUDA cores more intensively, cropping and resizing in NVDEC is for 
> free :)

I wasn't talking about libnpp. I'm assuming they provide their
processing stuff as separate APIs somewhere.


Re: [FFmpeg-devel] [PATCH] (for discussion): ffmpeg: prefer cuvid decoders when use option -cuvid

2017-02-13 Thread Miroslav Slugeň

Dne 13.2.2017 v 05:08 wm4 napsal(a):

On Sun, 12 Feb 2017 21:20:12 +
Mark Thompson  wrote:


On 12/02/17 20:37, Miroslav Slugeň wrote:

This patch is for discussion only, not ready to commit yet and maybe never will be.

We were facing issue when using -hwaccel cuvid we have to also specify input 
decoder like -c:v _cuvid for every input and input video format was 
sometimes mpeg2/h264/hevc. So this is my FIX/HACK to only specify -cuvid and 
ffmpeg will pick cuvid decoder for any supported input.

I don't know correct solution for this yet.

Adding global variables to libraries to mess with their internals is not an 
acceptable solution to anything.

The correct solution to this problem is to write a real cuvid hwaccel, which 
works within the existing decoder to offer decoding of streams which it 
supports without changing the behaviour at all in normal software cases 
(compare the behaviour of cuvid (standalone decoder) with dxva2, vaapi or vdpau 
(full hwaccels inside the normal decoder)).

An alternative solution for your specific case would be to disable the normal 
H.264, MPEG-2, etc. decoders in your build, such that the cuvid decoder appears 
first in the list and would always be picked for any given stream.  (This of 
course would also remove support for the wider set of streams which the 
libavcodec decoders support, such as H.264 at higher bit depths, though given 
that your patch here also has that effect I assume you aren't particularly 
concerned about that case.)

What's the problem with just specifying the correct decoder? Both API
and ffmpeg.c allow doing this.

Although what I don't like is that API users need to do not-so-clean
things to find the right decoder, e.g. relying on the naming
conventions, and assuming you can get the the cuda wrapper by
concatenating the codec and API names e.g. h264 -> "h264_cuda".
Maybe adding decoder and API fields to AVHWAccel would be nice.
Then ffmpeg.c could get an option to select codecs by API too.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Problem is that when you are using for example 200 input streams you 
have to always specify correct input format h264/mpeg2/nvenc. Also when 
you are using -hwaccel cuvid you have to specify it too, otherwise there 
is error: CUVID hwaccel requested, but impossible to achieve.


M.
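
The naming convention wm4 describes in the quoted text (codec name plus an API suffix) can be sketched as a small helper. This is only an illustration of the string concatenation under discussion: hw_decoder_name is a hypothetical function, and the idea of passing the result to avcodec_find_decoder_by_name() is an assumption about how a caller might use it, not something the thread proposes as API.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "<codec>_<api>", e.g. "h264" + "cuvid" -> "h264_cuvid".
 * Returns 0 on success, -1 if the name does not fit in out. */
static int hw_decoder_name(char *out, size_t size,
                           const char *codec, const char *api)
{
    int ret = snprintf(out, size, "%s_%s", codec, api);

    if (ret < 0 || (size_t)ret >= size)
        return -1;
    return 0;
}
```

A caller would still have to handle the case where no decoder with the resulting name exists, which is the fragility wm4 objects to.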



Re: [FFmpeg-devel] [PATCH] (for discussion): cuvid: allow to crop and resize in decoder

2017-02-13 Thread Miroslav Slugeň

Dne 13.2.2017 v 05:03 wm4 napsal(a):

On Sun, 12 Feb 2017 21:07:40 +0100
Miroslav Slugeň  wrote:


Dne 12.2.2017 v 20:59 Hendrik Leppkes napsal(a):

On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň  wrote:

This patch is for discussion only, not ready to commit yet.

1. Cuvid decoder actually supports scaling input to requested resolution
without any performance penalty (like libnpp does), so this patch is proof
of concept that it is working like expected.
  

I don't think scaling is something a decoder should be doing, we don't
really want all sorts of video processing jumbled up into one
monolithic cuvid thing, but rather keep tasks separated.

- Hendrik
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Yes, but when you are transcoding from FHD or 4K to SD quality it could save
a lot of GPU resources.

We have one example where "ONE" Quadro P5000 (2xNVENC) is downscaling
about 74 FHD streams to SD at realtime.

I know it is not something that is acceptable in current ffmpeg, maybe
libav could adopt this patch.

You mean the Libav project? They'd be even less likely to accept such a
patch.

Anyway, I don't think this would be slower than doing it in some sort
of separate cuda video filter.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This is not true, NVDEC (cuvid) is separate chipset and has its own 
NVDEC load in nvidia-smi monitoring tool, while resizing with libnpp is 
completely done on CUDA cores. In NVDEC only deinterlacing ADAPTIVE is 
using CUDA cores more intensively, cropping and resizing in NVDEC is for 
free :)


M.