Re: [FFmpeg-devel] [PATCH 2/2] libavformat/mov: fix udta reading in trak box

2022-02-07 Thread Wang Chuan

Any news?

On 2022/2/4 19:10, Jan Ekström wrote:

On Fri, Feb 4, 2022 at 5:24 AM Wang Chuan  wrote:

Ping?
On Jan 28, 2022, 11:24 AM +0800, Wang Chuan wrote:

if we are reading udta in trak box, the data should go to metadata
of current stream.

Signed-off-by: Wang Chuan 
---
libavformat/mov.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index 1437d160f8..cb983defb3 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -522,7 +522,10 @@ retry:
str[str_size] = 0;
}
c->fc->event_flags |= AVFMT_EVENT_FLAG_METADATA_UPDATED;
- av_dict_set(&c->fc->metadata, key, str, 0);
+ if (c->trak_index != -1)
+ av_dict_set(&c->fc->streams[c->trak_index]->metadata, key, str, 0);
+ else
+ av_dict_set(&c->fc->metadata, key, str, 0);
if (*language && strcmp(language, "und")) {
snprintf(key2, sizeof(key2), "%s-%s", key, language);
av_dict_set(&c->fc->metadata, key2, str, 0);
--
2.29.2

I recall having some patches on my github regarding something related,
will attempt to check this during the week-end.

Jan
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".



[FFmpeg-devel] [PATCH] lavu/fifo: fix regression

2022-02-07 Thread Xiang, Haihao
From: Haihao Xiang 

offset_w might be updated after growing the FIFO

Fix ticket #9630

Tested-by: U. Artie Eoff 
Reviewed-by: mkver
Reviewed-by: U. Artie Eoff 
Signed-off-by: Haihao Xiang 
---
 libavutil/fifo.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavutil/fifo.c b/libavutil/fifo.c
index 0af0154945..02e0ec3f0d 100644
--- a/libavutil/fifo.c
+++ b/libavutil/fifo.c
@@ -147,13 +147,15 @@ static int fifo_write_common(AVFifo *f, const uint8_t *buf, size_t *nb_elems,
  AVFifoCB read_cb, void *opaque)
 {
 size_t to_write = *nb_elems;
-size_t offset_w = f->offset_w;
+size_t offset_w;
 int ret = 0;
 
 ret = fifo_check_space(f, to_write);
 if (ret < 0)
 return ret;
 
+offset_w = f->offset_w;
+
 while (to_write > 0) {
 size_t    len = FFMIN(f->nb_elems - offset_w, to_write);
 uint8_t *wptr = f->buffer + offset_w * f->elem_size;
-- 
2.17.1
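
The ordering issue this patch addresses can be illustrated with a toy byte FIFO. This is a minimal sketch under stated assumptions, not FFmpeg's AVFifo code: ToyFifo, toy_grow(), toy_write() and toy_read() are hypothetical names. Growing a wrapped ring buffer relocates data and moves the write offset, so the offset has to be read after the grow, not before it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy byte FIFO; all names here are hypothetical, not FFmpeg API. */
typedef struct ToyFifo {
    unsigned char *buf;
    size_t size;      /* capacity in bytes */
    size_t used;      /* bytes currently stored */
    size_t offset_r;  /* next read position */
    size_t offset_w;  /* next write position */
} ToyFifo;

/* Grow the buffer by inc bytes. If the stored data wraps around the end of
 * the old buffer, the wrapped part is moved into the new space, which
 * changes offset_w. */
static int toy_grow(ToyFifo *f, size_t inc)
{
    unsigned char *tmp = realloc(f->buf, f->size + inc);
    if (!tmp)
        return -1;
    f->buf = tmp;

    if (f->used && f->offset_w <= f->offset_r) {
        size_t copy = inc < f->offset_w ? inc : f->offset_w;
        memcpy(tmp + f->size, tmp, copy);
        if (copy < f->offset_w) {
            memmove(tmp, tmp + copy, f->offset_w - copy);
            f->offset_w -= copy;
        } else {
            f->offset_w = copy == inc ? 0 : f->size + copy;
        }
    }
    f->size += inc;
    return 0;
}

static int toy_write(ToyFifo *f, const unsigned char *src, size_t n)
{
    size_t avail = f->size - f->used;
    size_t offset_w;

    if (avail < n && toy_grow(f, n - avail) < 0)
        return -1;

    /* Read the write offset only now: toy_grow() may have changed it. */
    offset_w = f->offset_w;
    for (size_t i = 0; i < n; i++) {
        f->buf[offset_w] = src[i];
        offset_w = (offset_w + 1) % f->size;
    }
    f->offset_w = offset_w;
    f->used += n;
    return 0;
}

static void toy_read(ToyFifo *f, unsigned char *dst, size_t n)
{
    assert(n <= f->used);
    for (size_t i = 0; i < n; i++) {
        dst[i] = f->buf[f->offset_r];
        f->offset_r = (f->offset_r + 1) % f->size;
    }
    f->used -= n;
}
```

If toy_write() cached offset_w before calling toy_grow(), a write that triggers a grow on a wrapped buffer would land on top of unread data.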



[FFmpeg-devel] [PATCH V3 3/3] libavcodec/vaapi_encode: Add async_depth to vaapi_encoder to increase performance

2022-02-07 Thread Wenbin Chen
Add async_depth to increase the encoder's performance. Reuse encode_fifo as
the async buffer. The encoder sends all reordered frames to the HW and then
checks the fifo size. If the fifo holds fewer than async_depth frames and the
top frame is not ready, it returns AVERROR(EAGAIN) to request more frames.

1080p transcoding (no B frames) with -async_depth=4 can improve performance
by 20% in my environment.
Async operation increases performance but also introduces frame delay.

Signed-off-by: Wenbin Chen 
---
 libavcodec/vaapi_encode.c | 16 
 libavcodec/vaapi_encode.h | 12 ++--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 15ddbbaa4a..432abf31f7 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -1158,7 +1158,8 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, AVFrame *frame)
 if (ctx->input_order == ctx->decode_delay)
 ctx->dts_pts_diff = pic->pts - ctx->first_pts;
 if (ctx->output_delay > 0)
-ctx->ts_ring[ctx->input_order % (3 * ctx->output_delay)] = pic->pts;
+ctx->ts_ring[ctx->input_order %
+(3 * ctx->output_delay + ctx->async_depth)] = pic->pts;
 
 pic->display_order = ctx->input_order;
 ++ctx->input_order;
@@ -1214,7 +1215,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 
 #if VA_CHECK_VERSION(1, 9, 0)
 if (ctx->has_sync_buffer_func) {
-while (av_fifo_can_read(ctx->encode_fifo) <= MAX_PICTURE_REFERENCES) {
+while (av_fifo_can_read(ctx->encode_fifo) <= MAX_ASYNC_DEPTH) {
 pic = NULL;
 err = vaapi_encode_pick_next(avctx, &pic);
 if (err < 0)
@@ -1232,6 +1233,13 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 }
 if (!av_fifo_can_read(ctx->encode_fifo))
 return err;
+if (av_fifo_can_read(ctx->encode_fifo) < ctx->async_depth &&
+!ctx->end_of_stream) {
+av_fifo_peek(ctx->encode_fifo, &pic, 1, 0);
+err = vaapi_encode_wait(avctx, pic, 0);
+if (err < 0)
+return err;
+}
 av_fifo_read(ctx->encode_fifo, &pic, 1);
 ctx->encode_order = pic->encode_order + 1;
 } else
@@ -1267,7 +1275,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 pkt->dts = ctx->ts_ring[pic->encode_order] - ctx->dts_pts_diff;
 } else {
 pkt->dts = ctx->ts_ring[(pic->encode_order - ctx->decode_delay) %
-(3 * ctx->output_delay)];
+(3 * ctx->output_delay + ctx->async_depth)];
 }
 av_log(avctx, AV_LOG_DEBUG, "Output packet: pts %"PRId64" dts %"PRId64".\n",
pkt->pts, pkt->dts);
@@ -2588,7 +2596,7 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
 vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
 if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
 ctx->has_sync_buffer_func = 1;
-ctx->encode_fifo = av_fifo_alloc2(MAX_PICTURE_REFERENCES + 1,
+ctx->encode_fifo = av_fifo_alloc2(MAX_ASYNC_DEPTH,
   sizeof(VAAPIEncodePicture *),
   0);
 if (!ctx->encode_fifo)
diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
index d33a486cb8..691521387d 100644
--- a/libavcodec/vaapi_encode.h
+++ b/libavcodec/vaapi_encode.h
@@ -48,6 +48,7 @@ enum {
 MAX_TILE_ROWS  = 22,
 // A.4.1: table A.6 allows at most 20 tile columns for any level.
 MAX_TILE_COLS  = 20,
+MAX_ASYNC_DEPTH = 64,
 };
 
 extern const AVCodecHWConfigInternal *const ff_vaapi_encode_hw_configs[];
@@ -298,7 +299,8 @@ typedef struct VAAPIEncodeContext {
 // Timestamp handling.
 int64_t first_pts;
 int64_t dts_pts_diff;
-int64_t ts_ring[MAX_REORDER_DELAY * 3];
+int64_t ts_ring[MAX_REORDER_DELAY * 3 +
+MAX_ASYNC_DEPTH];
 
 // Slice structure.
 int slice_block_rows;
@@ -350,6 +352,8 @@ typedef struct VAAPIEncodeContext {
 AVFifo *encode_fifo;
 //Whether the driver support vaSyncBuffer
 int has_sync_buffer_func;
+//Max number of frame buffered in encoder.
+int async_depth;
 } VAAPIEncodeContext;
 
 enum {
@@ -460,7 +464,11 @@ int ff_vaapi_encode_close(AVCodecContext *avctx);
 { "b_depth", \
   "Maximum B-frame reference depth", \
   OFFSET(common.desired_b_depth), AV_OPT_TYPE_INT, \
-  { .i64 = 1 }, 1, INT_MAX, FLAGS }
+  { .i64 = 1 }, 1, INT_MAX, FLAGS }, \
+{ "async_depth", "Maximum processing parallelism. " \
+  "Increase this to improve single channel performance", \
+  OFFSET(common.async_depth), AV_OPT_TYPE_INT, \
+  { .i64 = 4 }, 0, MAX_ASYNC_DEPTH, FLAGS }
 
 #define VAAPI_ENCODE_RC_MODE(name, desc) \
 { #name, desc, 0, AV_OPT_TYPE_CONST, { .i64 = 

[FFmpeg-devel] [PATCH V3 2/3] libavcodec/vaapi_encode: Change the way to call async to increase performance

2022-02-07 Thread Wenbin Chen
Fix: #7706. After commit 5fdcf85bbffe7451c2, the vaapi encoder's performance
decreased. The reason is that vaRenderPicture() and vaSyncBuffer() are
called at the same time (vaRenderPicture() is always followed by a
vaSyncBuffer()). When we encode a stream with B frames, we need a buffer to
reorder frames, so we can send several frames to the HW at once to increase
performance. Now they are called asynchronously, which makes better use of
the hardware. 1080p transcoding fps increases by about 17% in my
environment.

This change fits vaSyncBuffer(), so if the driver does not support
vaSyncBuffer, it keeps the previous behavior.

Signed-off-by: Wenbin Chen 
---
 libavcodec/vaapi_encode.c | 64 ---
 libavcodec/vaapi_encode.h |  5 +++
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index b87b58a42b..15ddbbaa4a 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -984,8 +984,10 @@ static int vaapi_encode_pick_next(AVCodecContext *avctx,
 if (!pic && ctx->end_of_stream) {
 --b_counter;
 pic = ctx->pic_end;
-if (pic->encode_issued)
+if (pic->encode_complete)
 return AVERROR_EOF;
+else if (pic->encode_issued)
+return AVERROR(EAGAIN);
 }
 
 if (!pic) {
@@ -1210,18 +1212,44 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 return AVERROR(EAGAIN);
 }
 
-pic = NULL;
-err = vaapi_encode_pick_next(avctx, &pic);
-if (err < 0)
-return err;
-av_assert0(pic);
+#if VA_CHECK_VERSION(1, 9, 0)
+if (ctx->has_sync_buffer_func) {
+while (av_fifo_can_read(ctx->encode_fifo) <= MAX_PICTURE_REFERENCES) {
+pic = NULL;
+err = vaapi_encode_pick_next(avctx, &pic);
+if (err < 0)
+break;
+
+av_assert0(pic);
+pic->encode_order = ctx->encode_order +
+av_fifo_can_read(ctx->encode_fifo);
+err = vaapi_encode_issue(avctx, pic);
+if (err < 0) {
+av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+return err;
+}
+av_fifo_write(ctx->encode_fifo, &pic, 1);
+}
+if (!av_fifo_can_read(ctx->encode_fifo))
+return err;
+av_fifo_read(ctx->encode_fifo, &pic, 1);
+ctx->encode_order = pic->encode_order + 1;
+} else
+#endif
+{
+pic = NULL;
+err = vaapi_encode_pick_next(avctx, &pic);
+if (err < 0)
+return err;
+av_assert0(pic);
 
-pic->encode_order = ctx->encode_order++;
+pic->encode_order = ctx->encode_order++;
 
-err = vaapi_encode_issue(avctx, pic);
-if (err < 0) {
-av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
-return err;
+err = vaapi_encode_issue(avctx, pic);
+if (err < 0) {
+av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+return err;
+}
 }
 
 err = vaapi_encode_output(avctx, pic, pkt);
@@ -2555,6 +2583,19 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
 }
 }
 
+#if VA_CHECK_VERSION(1, 9, 0)
+//check vaSyncBuffer function
+vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
+if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
+ctx->has_sync_buffer_func = 1;
+ctx->encode_fifo = av_fifo_alloc2(MAX_PICTURE_REFERENCES + 1,
+  sizeof(VAAPIEncodePicture *),
+  0);
+if (!ctx->encode_fifo)
+return AVERROR(ENOMEM);
+}
+#endif
+
 return 0;
 
 fail:
@@ -2592,6 +2633,7 @@ av_cold int ff_vaapi_encode_close(AVCodecContext *avctx)
 
 av_freep(&ctx->codec_sequence_params);
 av_freep(&ctx->codec_picture_params);
+av_fifo_freep2(&ctx->encode_fifo);
 
 av_buffer_unref(&ctx->recon_frames_ref);
 av_buffer_unref(&ctx->input_frames_ref);
diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
index b41604a883..d33a486cb8 100644
--- a/libavcodec/vaapi_encode.h
+++ b/libavcodec/vaapi_encode.h
@@ -29,6 +29,7 @@
 
 #include "libavutil/hwcontext.h"
 #include "libavutil/hwcontext_vaapi.h"
+#include "libavutil/fifo.h"
 
 #include "avcodec.h"
 #include "hwconfig.h"
@@ -345,6 +346,10 @@ typedef struct VAAPIEncodeContext {
 int roi_warned;
 
 AVFrame *frame;
+//Store buffered pic
+AVFifo *encode_fifo;
+//Whether the driver support vaSyncBuffer
+int has_sync_buffer_func;
 } VAAPIEncodeContext;
 
 enum {
-- 
2.32.0



[FFmpeg-devel] [PATCH V3 1/3] libavcodec/vaapi_encode: Add new API adaption to vaapi_encode

2022-02-07 Thread Wenbin Chen
Add vaSyncBuffer to the VAAPI encoder. The old API, vaSyncSurface, waits for
a surface to complete. When a surface is used for multiple operations, it
waits for all operations to finish. vaSyncBuffer only waits for one channel
to finish.

Add a wait param to vaapi_encode_wait() to prepare for the async_depth
option. "wait=1" means wait until the operation is ready. "wait=0" means
query the operation's status: if it is ready return 0, if it is still
in progress return EAGAIN.

Signed-off-by: Wenbin Chen 
---
 libavcodec/vaapi_encode.c | 47 +--
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 3bf379b1a0..b87b58a42b 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -134,7 +134,8 @@ static int vaapi_encode_make_misc_param_buffer(AVCodecContext *avctx,
 }
 
 static int vaapi_encode_wait(AVCodecContext *avctx,
- VAAPIEncodePicture *pic)
+ VAAPIEncodePicture *pic,
+ uint8_t wait)
 {
 VAAPIEncodeContext *ctx = avctx->priv_data;
 VAStatus vas;
@@ -150,11 +151,43 @@ static int vaapi_encode_wait(AVCodecContext *avctx,
"(input surface %#x).\n", pic->display_order,
pic->encode_order, pic->input_surface);
 
-vas = vaSyncSurface(ctx->hwctx->display, pic->input_surface);
-if (vas != VA_STATUS_SUCCESS) {
-av_log(avctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
-   "%d (%s).\n", vas, vaErrorStr(vas));
+#if VA_CHECK_VERSION(1, 9, 0)
+// Try vaSyncBuffer.
+vas = vaSyncBuffer(ctx->hwctx->display,
+   pic->output_buffer,
+   wait ? VA_TIMEOUT_INFINITE : 0);
+if (vas == VA_STATUS_ERROR_TIMEDOUT) {
+return AVERROR(EAGAIN);
+} else if (vas != VA_STATUS_SUCCESS && vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
+av_log(avctx, AV_LOG_ERROR, "Failed to sync to output buffer completion: "
+"%d (%s).\n", vas, vaErrorStr(vas));
 return AVERROR(EIO);
+} else if (vas == VA_STATUS_ERROR_UNIMPLEMENTED)
+// If vaSyncBuffer is not implemented, try old version API.
+#endif
+{
+if (!wait) {
+VASurfaceStatus surface_status;
+vas = vaQuerySurfaceStatus(ctx->hwctx->display,
+pic->input_surface,
+&surface_status);
+if (vas == VA_STATUS_SUCCESS &&
+surface_status != VASurfaceReady &&
+surface_status != VASurfaceSkipped) {
+return AVERROR(EAGAIN);
+} else if (vas != VA_STATUS_SUCCESS) {
+av_log(avctx, AV_LOG_ERROR, "Failed to query surface status: "
+"%d (%s).\n", vas, vaErrorStr(vas));
+return AVERROR(EIO);
+}
+} else {
+vas = vaSyncSurface(ctx->hwctx->display, pic->input_surface);
+if (vas != VA_STATUS_SUCCESS) {
+av_log(avctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+"%d (%s).\n", vas, vaErrorStr(vas));
+return AVERROR(EIO);
+}
+}
 }
 
 // Input is definitely finished with now.
@@ -633,7 +666,7 @@ static int vaapi_encode_output(AVCodecContext *avctx,
 uint8_t *ptr;
 int err;
 
-err = vaapi_encode_wait(avctx, pic);
+err = vaapi_encode_wait(avctx, pic, 1);
 if (err < 0)
 return err;
 
@@ -695,7 +728,7 @@ fail:
 static int vaapi_encode_discard(AVCodecContext *avctx,
 VAAPIEncodePicture *pic)
 {
-vaapi_encode_wait(avctx, pic);
+vaapi_encode_wait(avctx, pic, 1);
 
 if (pic->output_buffer_ref) {
 av_log(avctx, AV_LOG_DEBUG, "Discard output for pic "
-- 
2.32.0



Re: [FFmpeg-devel] [PATCH v2 2/2] lavc/aarch64: add hevc epel assembly

2022-02-07 Thread Martin Storsjö

On Thu, 3 Feb 2022, J. Dekker wrote:


Thanks: Rafal Dabrowa 
---
libavcodec/aarch64/Makefile   |3 +-
libavcodec/aarch64/hevcdsp_epel_neon.S| 2501 +
libavcodec/aarch64/hevcdsp_init_aarch64.c |   52 +
3 files changed, 2555 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/aarch64/hevcdsp_epel_neon.S


The same comments as for the qpel code apply here too.

// Martin



Re: [FFmpeg-devel] [PATCH v2 1/2] lavc/aarch64: add hevc qpel assembly

2022-02-07 Thread Martin Storsjö

On Thu, 3 Feb 2022, J. Dekker wrote:


Thanks: Rafal Dabrowa 
---
libavcodec/aarch64/Makefile   |1 +
libavcodec/aarch64/hevcdsp_init_aarch64.c |   67 +
libavcodec/aarch64/hevcdsp_qpel_neon.S| 2799 +
3 files changed, 2867 insertions(+)
create mode 100644 libavcodec/aarch64/hevcdsp_qpel_neon.S

Had trouble testing on a Linux machine as well, but have a workflow
setup for that now so should be easier in the future. Passes FATE on
both macOS and Linux.



+NEON8_FNPROTO(qpel_h, (int16_t *dst,
+uint8_t *src, ptrdiff_t srcstride,
+int height, intptr_t mx, intptr_t my, int width));


Passing a whole parenthesized expression like this, via one macro 
parameter, feels quite unorthodox to me, but it does seem to work now with 
all compilers I have to test with, so I guess it's tolerable that way.



+
+#include "libavutil/aarch64/asm.S"
+#define MAX_PB_SIZE 64
+
+.Lqpel_filters:
+.byte  0,  0,  0,  0,  0,  0, 0,  0


This assembles incorrectly with gas-preprocessor targeting MSVC armasm64.

Normally we enclose all such constants in const/endconst, which sets up 
the appropriate section and all that. But if put into the const data 
section, it's probably too far away for an 'adr' instruction, so then 
you'd need to use the movrel macro (expanding to 'adrp' + 'add').


A less elegant workaround for armasm/gas-preprocessor is to just add a 
'.text' above this.



+.byte -1,  4,-10, 58, 17, -5, 1,  0
+.byte -1,  4,-11, 40, 40,-11, 4, -1
+.byte  0,  1, -5, 17, 58,-10, 4, -1
+
+.macro load_qpel_filterb freg, xreg
+adr \xreg, .Lqpel_filters
+add \xreg, \xreg, \freg, lsl #3
+ld4r   {v0.16b, v1.16b, v2.16b, v3.16b}, [\xreg], #4
+ld4r   {v4.16b, v5.16b, v6.16b, v7.16b}, [\xreg]


Please follow the normal coding style (align the starting '{' just like 
other characters at the start of the operand column, don't leave it 
outside). This goes for the whole file.



+neg v0.16b, v0.16b
+neg v2.16b, v2.16b
+neg v5.16b, v5.16b
+neg v7.16b, v7.16b


Why these negations? Can't you just change the corresponding umlsl/umlal 
instructions matchingly?


Also, can't those umlsl/umlal use the elementwise form, e.g. v0.b[0], so 
you wouldn't need to waste 8 full registers on the coefficients? (If 
you've got enough registers so you don't need to clobber v8-v15, there's 
probably no benefit in squeezing things tighter though. But if there's 
code that could be made more efficient if you'd have more spare registers, 
that could help.)



+.endm
+
+.macro calc_qpelb dst, src0, src1, src2, src3, src4, src5, src6, src7
+umlsl   \dst\().8h, \src0\().8b, v0.8b


Could this first one be plain 'umull' (if you wouldn't negate the 
coefficient), avoiding an extra 'movi v28.8h, #0'?



+umlal   \dst\().8h, \src1\().8b, v1.8b
+umlsl   \dst\().8h, \src2\().8b, v2.8b
+umlal   \dst\().8h, \src3\().8b, v3.8b
+umlal   \dst\().8h, \src4\().8b, v4.8b
+umlsl   \dst\().8h, \src5\().8b, v5.8b
+umlal   \dst\().8h, \src6\().8b, v6.8b
+umlsl   \dst\().8h, \src7\().8b, v7.8b
+.endm
+
+.macro calc_qpelb2 dst, src0, src1, src2, src3, src4, src5, src6, src7
+umlsl2  \dst\().8h, \src0\().16b, v0.16b
+umlal2  \dst\().8h, \src1\().16b, v1.16b
+umlsl2  \dst\().8h, \src2\().16b, v2.16b
+umlal2  \dst\().8h, \src3\().16b, v3.16b
+umlal2  \dst\().8h, \src4\().16b, v4.16b
+umlsl2  \dst\().8h, \src5\().16b, v5.16b
+umlal2  \dst\().8h, \src6\().16b, v6.16b
+umlsl2  \dst\().8h, \src7\().16b, v7.16b
+.endm
+
+.macro load_qpel_filterh freg, xreg
+adr \xreg, .Lqpel_filters
+add \xreg, \xreg, \freg, lsl #3
+ld1 {v0.8b}, [\xreg]
+sxtl    v0.8h, v0.8b
+.endm
+
+.macro calc_qpelh dst, src0, src1, src2, src3, src4, src5, src6, src7, op, shift=6
+smull   \dst\().4s, \src0\().4h, v0.h[0]
+smlal   \dst\().4s, \src1\().4h, v0.h[1]
+smlal   \dst\().4s, \src2\().4h, v0.h[2]
+smlal   \dst\().4s, \src3\().4h, v0.h[3]
+smlal   \dst\().4s, \src4\().4h, v0.h[4]
+smlal   \dst\().4s, \src5\().4h, v0.h[5]
+smlal   \dst\().4s, \src6\().4h, v0.h[6]
+smlal   \dst\().4s, \src7\().4h, v0.h[7]
+.ifc \op, sshr
+sshr    \dst\().4s, \dst\().4s, \shift
+.else
+\op \dst\().4h, \dst\().4s, \shift
+.endif
+.endm
+
+.macro calc_qpelh2 dst, dstt, src0, src1, src2, src3, src4, src5, src6, src7, op, shift=6
+smull2  \dstt\().4s, \src0\().8h, v0.h[0]
+smlal2  

Re: [FFmpeg-devel] [PATCH] configure: Fix Microsoft tools detection

2022-02-07 Thread Martin Storsjö

On Thu, 3 Feb 2022, Marvin Scholz wrote:




On 3 Feb 2022, at 12:55, Hendrik Leppkes wrote:

On Thu, Feb 3, 2022 at 12:34 PM Martin Storsjö wrote:


On Thu, 3 Feb 2022, Kacper Michajlow wrote:

On Wed, 26 Jan 2022 at 15:00, Martin Storsjö wrote:


Hi,

On Sat, 22 Jan 2022, Kacper Michajłow wrote:

LLVM tools print installation path upon execution. If one uses LLVM
tools bundled with Microsoft Visual Studio installation, they would be
incorrectly detected as Microsoft's ones.

Signed-off-by: Kacper Michajłow 
---
configure | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)


While the patch description seems to make sense, I wanted to try it out to
see the practical effect for myself, and I fail to observe any difference.

Can you provide your exact configure command line you use, where it makes
a difference? I tried with "--cc=clang-cl --ld=lld-link --toolchain=msvc"
and that works just as fine before this patch.

In particular, the commands that you adjust run "$_cc -nologo-" and grep
for "Microsoft" in the output of that. When I run that with clang-cl, it
doesn't print a string containing "Microsoft".

// Martin


Hi,

Yes you are right. In case of CC it doesn't change anything. clang-cl
prints installation dir only with `-v`. The main thing this patch
fixes is `--ar=llvm-ar` where it is mistaken for lib.exe and used with
wrong parameters. While fixing this I figured to make CC check also
more strict, because at some point it could be a problem. Sync all of
them to have same style as one that was already there


Oh, ok, with the reference to llvm-ar, I see what it fixes. Thanks! The
reference to llvm-ar absolutely needs to be in the patch description then.

I remember that there has been some variance throughout the versions for
exactly what MSVC prints as the identification, but
I think 'Microsoft.*Optimizing.*Compiler' should be safe.



I was wondering if non-english locale would translate that string, but
I can't easily test that, I don't think.



Sorry, need to correct myself. It is indeed localized I was just lacking
the language pack…

For example in german it is:

Microsoft (R) C/C++-Optimierungscompiler Version 19.30.30709 für x64
Copyright (C) Microsoft Corporation. Alle Rechte vorbehalten.


So, should we scale back on this patch to only extend the regex for 
lib.exe (to 'Microsoft.*Library.*Manager') but leave the one for cl.exe as 
it is? Marvin verified that lib.exe doesn't seem to print a localized 
message, only cl.exe seems to do that. Otherwise there's a risk we'd break 
the currently working detection of cl.exe for users of localized MSVC.


// Martin


[FFmpeg-devel] [PATCH] aarch64: h264dsp: Fix incorrectly indented code

2022-02-07 Thread Martin Storsjö
Signed-off-by: Martin Storsjö 
---
This should reduce the risk of anyone accidentally writing new code
based on an incorrect example.
---
 libavcodec/aarch64/h264dsp_neon.S | 176 +++---
 1 file changed, 88 insertions(+), 88 deletions(-)

diff --git a/libavcodec/aarch64/h264dsp_neon.S b/libavcodec/aarch64/h264dsp_neon.S
index 000ff762a3..ea221e6862 100644
--- a/libavcodec/aarch64/h264dsp_neon.S
+++ b/libavcodec/aarch64/h264dsp_neon.S
@@ -960,117 +960,117 @@ function ff_h264_h_loop_filter_chroma422_neon_10, export=1
 endfunc
 
 .macro h264_loop_filter_chroma_intra_10
-   uabd    v26.8h,  v16.8h,  v17.8h  // abs(p0 - q0)
-   uabd    v27.8h,  v18.8h,  v16.8h  // abs(p1 - p0)
-   uabd    v28.8h,  v19.8h,  v17.8h  // abs(q1 - q0)
-   cmhi    v26.8h,  v30.8h,  v26.8h  // < alpha
-   cmhi    v27.8h,  v31.8h,  v27.8h  // < beta
-   cmhi    v28.8h,  v31.8h,  v28.8h  // < beta
-   and v26.16b, v26.16b, v27.16b
-   and v26.16b, v26.16b, v28.16b
-   mov x2, v26.d[0]
-   mov x3, v26.d[1]
-
-   shl v4.8h,  v18.8h,  #1
-   shl v6.8h,  v19.8h,  #1
-
-   adds    x2,  x2,  x3
-   b.eq    9f
-
-   add v20.8h,  v16.8h,  v19.8h
-   add v22.8h,  v17.8h,  v18.8h
-   add v20.8h,  v20.8h,  v4.8h
-   add v22.8h,  v22.8h,  v6.8h
-   urshr   v24.8h,  v20.8h,  #2
-   urshr   v25.8h,  v22.8h,  #2
-   bit v16.16b, v24.16b, v26.16b
-   bit v17.16b, v25.16b, v26.16b
+uabd    v26.8h,  v16.8h,  v17.8h  // abs(p0 - q0)
+uabd    v27.8h,  v18.8h,  v16.8h  // abs(p1 - p0)
+uabd    v28.8h,  v19.8h,  v17.8h  // abs(q1 - q0)
+cmhi    v26.8h,  v30.8h,  v26.8h  // < alpha
+cmhi    v27.8h,  v31.8h,  v27.8h  // < beta
+cmhi    v28.8h,  v31.8h,  v28.8h  // < beta
+and v26.16b, v26.16b, v27.16b
+and v26.16b, v26.16b, v28.16b
+mov x2, v26.d[0]
+mov x3, v26.d[1]
+
+shl v4.8h,  v18.8h,  #1
+shl v6.8h,  v19.8h,  #1
+
+adds    x2,  x2,  x3
+b.eq    9f
+
+add v20.8h,  v16.8h,  v19.8h
+add v22.8h,  v17.8h,  v18.8h
+add v20.8h,  v20.8h,  v4.8h
+add v22.8h,  v22.8h,  v6.8h
+urshr   v24.8h,  v20.8h,  #2
+urshr   v25.8h,  v22.8h,  #2
+bit v16.16b, v24.16b, v26.16b
+bit v17.16b, v25.16b, v26.16b
 .endm
 
 function ff_h264_v_loop_filter_chroma_intra_neon_10, export=1
-   h264_loop_filter_start_intra_10
-   mov x9,  x0
-   sub x0,  x0,  x1, lsl #1
-   ld1 {v18.8h}, [x0], x1
-   ld1 {v17.8h}, [x9], x1
-   ld1 {v16.8h}, [x0], x1
-   ld1 {v19.8h}, [x9]
+h264_loop_filter_start_intra_10
+mov x9,  x0
+sub x0,  x0,  x1, lsl #1
+ld1 {v18.8h}, [x0], x1
+ld1 {v17.8h}, [x9], x1
+ld1 {v16.8h}, [x0], x1
+ld1 {v19.8h}, [x9]
 
-   h264_loop_filter_chroma_intra_10
+h264_loop_filter_chroma_intra_10
 
-   sub x0,  x9,  x1, lsl #1
-   st1 {v16.8h}, [x0], x1
-   st1 {v17.8h}, [x0], x1
+sub x0,  x9,  x1, lsl #1
+st1 {v16.8h}, [x0], x1
+st1 {v17.8h}, [x0], x1
 
 9:
-   ret
+ret
 endfunc
 
 function ff_h264_h_loop_filter_chroma_mbaff_intra_neon_10, export=1
-   h264_loop_filter_start_intra_10
+h264_loop_filter_start_intra_10
 
-   sub x4,  x0,  #4
-   sub x0,  x0,  #2
-   add x9,  x4,  x1, lsl #1
-   ld1 {v18.8h}, [x4], x1
-   ld1 {v17.8h}, [x9], x1
-   ld1 {v16.8h}, [x4], x1
-   ld1 {v19.8h}, [x9], x1
+sub x4,  x0,  #4
+sub x0,  x0,  #2
+add x9,  x4,  x1, lsl #1
+ld1 {v18.8h}, [x4], x1
+ld1 {v17.8h}, [x9], x1
+ld1 {v16.8h}, [x4], x1
+ld1 {v19.8h}, [x9], x1
 
-   transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
+transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
 
-   h264_loop_filter_chroma_intra_10
+h264_loop_filter_chroma_intra_10
 
-   st2 {v16.h,v17.h}[0], [x0], x1
-   st2 {v16.h,v17.h}[1], [x0], x1
-   st2 {v16.h,v17.h}[2], [x0], x1
-   st2 

Re: [FFmpeg-devel] [PATCH] avformat/mxfdec: add avlanguage dependency

2022-02-07 Thread Tomas Härdin
On Sat, 2022-02-05 at 22:59 +1000, Zane van Iperen wrote:
> Signed-off-by: Zane van Iperen 
> ---
>  libavformat/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavformat/Makefile b/libavformat/Makefile
> index 3dc6a479cc..6566e40cac 100644
> --- a/libavformat/Makefile
> +++ b/libavformat/Makefile
> @@ -374,7 +374,7 @@ OBJS-$(CONFIG_MTV_DEMUXER)   += mtv.o
>  OBJS-$(CONFIG_MUSX_DEMUXER)  += musx.o
>  OBJS-$(CONFIG_MV_DEMUXER)    += mvdec.o
>  OBJS-$(CONFIG_MVI_DEMUXER)   += mvi.o
> -OBJS-$(CONFIG_MXF_DEMUXER)   += mxfdec.o mxf.o
> +OBJS-$(CONFIG_MXF_DEMUXER)   += mxfdec.o mxf.o avlanguage.o
>  OBJS-$(CONFIG_MXF_MUXER) += mxfenc.o mxf.o avc.o
>  OBJS-$(CONFIG_MXG_DEMUXER)   += mxg.o
>  OBJS-$(CONFIG_NC_DEMUXER)    += ncdec.o

Looks OK

/Tomas



Re: [FFmpeg-devel] [PATCH v4 1/1] avformat: Add IPFS protocol support.

2022-02-07 Thread Tomas Härdin
On Fri, 2022-02-04 at 15:12 +0100, Mark Gaiser wrote:
> On Fri, Feb 4, 2022 at 12:10 PM Tomas Härdin 
> wrote:
> 
> > On Thu, 2022-02-03 at 18:29 +0100, Mark Gaiser wrote:
> > > 
> > > +typedef struct IPFSGatewayContext {
> > > +    AVClass *class;
> > > +    URLContext *inner;
> > > +    char *gateway;
> > 
> > Consider two separate variables. One for AVOption and one for the
> > dynamically allocated string. Or put the latter on the stack.
> > 
> 
> There always needs to be a gateway so why is reusing that variable an
> issue?
> I'm fine splitting it up but I'd like to understand the benefit of it
> as
> currently I don't see that benefit.

Because of the way AVOption memory allocation works

> 
> > > +static int populate_ipfs_gateway(URLContext *h)
> > > +{
> > > +    IPFSGatewayContext *c = h->priv_data;
> > > +    char *ipfs_full_data_folder = NULL;
> > > +    char *ipfs_gateway_file = NULL;
> > 
> > These can be char[PATH_MAX]
> > 
> 
> Oke, will do.
> C code question though.
> How do I use av_asprintf on stack arrays like that?

snprintf(). Also be careful with PATH_MAX and +-1 bytes for the NUL.

> 
> > Again, there is no reason to stat this. Just try opening the
> > gateway
> > file directly.
> > 
> 
> This is a folder, not a file.
> 
> The other stat that was here too was a file, I replaced that with an
> fopen.
> It smells sketchy to me to (ab)use fopen to check if a folder exists.
> There's stat for that.

You don't need to check whether the folder exists at all. The only
thing that accomplishes is some AV_LOG_DEBUG prints that won't even get
compiled in unless a user builds with -g (I think). It's not sketchy -
it's spec'd behavior.

> 
> 
> > 
> > > +
> > > +    // Read a single line (fgets stops at new line mark).
> > > +    fgets(gateway_file_data, sizeof(gateway_file_data) - 1, gateway_file);
> > 
> > This can result in gateway_file_data not being NUL terminated
> 
> 
> > > +
> > > +    // Replace first occurence of end of line to \0
> > > +    gateway_file_data[strcspn(gateway_file_data, "\r\n")] = 0;
> > 
> > What if the file uses \n or no newlines at all?
> > 
> 
> Right.
> So I guess the fix here is:
> 1. Initialize gateway_file_data so all bytes are zero
> 2. read a line
> 3. set the last byte of gateway_file_data to 0
> 
> Now any text in the string will be the gateway.
> 
> Is that a proper fix?

Yes always putting a NUL at the end works. You don't need to initialize
with zero in that case. fgets() will NUL terminate except when there's
an error like the line being too long.
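
To make the fgets() point concrete, here is a minimal sketch of reading one line and stripping the line ending. It is not taken from the patch: read_first_line() is a hypothetical helper, and the behaviour described in the comments is what the C standard specifies for fgets() and strcspn(). Note that a line longer than the buffer is truncated by fgets() but still NUL-terminated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: reads the first line of f
 * into buf and strips a trailing "\n" or "\r\n".
 * Returns 0 on success, -1 on read failure or an empty line. */
static int read_first_line(FILE *f, char *buf, size_t size)
{
    assert(f && buf);

    /* fgets() stores at most size - 1 characters plus a terminating NUL,
     * so the full buffer size can be passed. It returns NULL on EOF with
     * nothing read or on a read error; buf must not be used in that case. */
    if (size == 0 || !fgets(buf, (int)size, f))
        return -1;

    /* strcspn() yields the index of the first '\r' or '\n', or strlen(buf)
     * if there is none, so files without a newline are handled as well. */
    buf[strcspn(buf, "\r\n")] = '\0';
    return buf[0] ? 0 : -1;
}
```

Checking the fgets() return value is what makes the later strcspn() call safe, since the buffer is only read after a successful, NUL-terminated read.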

> 
> 
> > > +err:
> > > +    if (gateway_file)
> > > +    fclose(gateway_file);
> > > +
> > > +    av_free(ipfs_full_data_folder);
> > > +    av_free(ipfs_gateway_file);
> > 
> > This is not cleaning up dynamic allocations of c->gateway
> > 
> 
> So I should do that in  ipfs_close, right?

That's one place to do it yes. I forget whether _close() is called in
case of errors. av_freep() will set the pointer to NULL after freeing
so no double-frees occur.

> 
> > 
> > 
> > > +    // Test if the gateway starts with either http:// or https://
> > > +    // The remainder is stored in url_without_protocol
> > > +    if (av_stristart(uri, "http://", &url_without_protocol) == 0
> > > +    && av_stristart(uri, "https://", &url_without_protocol) == 0) {
> > > +    av_log(h, AV_LOG_ERROR, "The gateway URL didn't start with http:// or https:// and is therefore invalid.\n");
> > > +    ret = -2;
> > > +    goto err;
> > > +    }
> > 
> > I guess restricting this to HTTP schemes is OK. Or are there non-
> > HTTP
> > gateways for this?
> > 
> 
> No.
> At least not from the IPFS camp.
> The IPFS software creates a gateway and that is specifically an http
> gateway.
> Users can put that behind a proxy making it (potentially) a https
> gateway
> but that's about it.

I see. I guess if any user puts this stuff behind gopher:// or
something then that's their problem.

> 
> > 
> > > +    if (last_gateway_char != '/') {
> > > +    c->gateway = av_asprintf("%s/", c->gateway);
> > 
> > Yet another leak
> > 
> 
> Please tell me how to fix this one.
> As you can see, I need the c->gateway value to copy and add a "/" to
> it.
> 
> In C++ this would just be a dead simple append ;)

Ensure there's enough space for '/' and a NUL and just write that to
the end.

snprintf() can do all of this if used appropriately. For example to
conditionally append "/" you can put %s in the format string and the
ternary

 needs_slash ? "/" : ""

as the associated argument
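
A minimal sketch of that suggestion, assuming a caller-provided destination buffer. normalize_gateway() and its parameters are hypothetical names used for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies gateway into dst and appends a trailing '/'
 * only when one is missing.
 * Returns 0 on success, -1 if the input is empty or dst is too small. */
static int normalize_gateway(char *dst, size_t dst_size, const char *gateway)
{
    size_t len;
    int needs_slash, written;

    assert(dst && gateway);

    len = strlen(gateway);
    if (len == 0)
        return -1;

    needs_slash = gateway[len - 1] != '/';
    written = snprintf(dst, dst_size, "%s%s", gateway, needs_slash ? "/" : "");

    /* snprintf() returns the length it wanted to write, excluding the NUL;
     * a value >= dst_size means the output was truncated. */
    if (written < 0 || (size_t)written >= dst_size)
        return -1;
    return 0;
}
```

Because the result is written into a separate buffer, nothing is reallocated and the original string is left untouched, which avoids the leak pointed out above.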

/Tomas



Re: [FFmpeg-devel] [PATCH] libavutil: include assembly with full path from source root

2022-02-07 Thread Alexander Kanavin
On Mon, 31 Jan 2022 at 14:29, Alexander Kanavin 
wrote:

> On Mon, 31 Jan 2022 at 13:52, Anton Khirnov  wrote:
>
>> With a separate build directory, I'm getting
>> $ strings libavutil/x86/tx_float.o |grep asm
>> src/libavutil/x86/tx_float.asm
>>
>
> The key piece is
> ../configure --disable-stripping
>
> With stripping disabled and without the patch, you should see:
>
> alex@alex-lx-laptop:~/development/ffmpeg/build$ strings
> libavutil/x86/tx_float.o |grep asm
> src/libavutil/x86/tx_float.asm
> src/libavutil/x86/tx_float.asm
> /home/alex/development/ffmpeg/libavutil/x86/x86util.asm
> src/libavutil/x86/tx_float.asm
>

Ping, please :)

Alex


Re: [FFmpeg-devel] Libera.chat irc logs

2022-02-07 Thread Michael Niedermayer
On Sun, Feb 06, 2022 at 10:36:41AM +0100, Andreas Rheinhardt wrote:
> Gyan Doshi:
> > 
> > In our documentation, we state
> > 
> > 'Our IRC channels are publically logged and archives of both channels
> > can be viewed at ffmpeg-devel-irc.'
> > 
> > However, the archives in the link end with June 2020 i.e. at the time of
> > the Libera switchover. The topic for #ffmpeg-devel at Libera does
> > include "This channel is publicly logged".
> > 
> > Where do we host logs for chats on Libera?
> > 
> > Regards,
> > Gyan
> 
> The logs stopped way before the libera switchover (which was only in
> 2021, not June 2020). There were issues even before then.

If someone can set up a new (more reliable) IRC log bot which sends
mails to the ffmpeg-devel-irc ML, that would be welcome.

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus




[FFmpeg-devel] [PATCH] avformat/udp: properly check for valid ttl in url

2022-02-07 Thread lance . lmwang
From: Limin Wang 

Zhao Zhili added a ttl upper bound in commit 9daac85da8,
but the check for ttl in url is missing still.

Signed-off-by: Limin Wang 
---
 libavformat/udp.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavformat/udp.c b/libavformat/udp.c
index da56c8e..401d9b6 100644
--- a/libavformat/udp.c
+++ b/libavformat/udp.c
@@ -674,6 +674,11 @@ static int udp_open(URLContext *h, const char *uri, int flags)
 }
 if (av_find_info_tag(buf, sizeof(buf), "ttl", p)) {
 s->ttl = strtol(buf, NULL, 10);
+if (s->ttl < 0 || s->ttl > 255) {
+av_log(h, AV_LOG_ERROR, "ttl(%d) should be in range [0,255]\n", s->ttl);
+ret = AVERROR(EINVAL);
+goto fail;
+}
 }
 if (av_find_info_tag(buf, sizeof(buf), "udplite_coverage", p)) {
 s->udplite_coverage = strtol(buf, NULL, 10);
-- 
1.8.3.1
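
For reference, a standalone sketch of the same range check that also rejects non-numeric input. parse_ttl() is a hypothetical helper, not part of this patch. It validates the long returned by strtol() before narrowing it to int, so a value such as 4294967296 cannot wrap into the valid range on platforms with a 64-bit long (this assumes the TTL is stored in an int, as the %d format above suggests).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses str as a base-10 TTL in [0, 255].
 * Returns 0 and stores the value in *ttl on success, -1 otherwise. */
static int parse_ttl(const char *str, int *ttl)
{
    char *end;
    long v;

    assert(str && ttl);

    errno = 0;
    v = strtol(str, &end, 10);

    /* Reject empty input, trailing garbage, and overflow of long. */
    if (end == str || *end != '\0' || errno == ERANGE)
        return -1;

    /* Range-check while the value is still a long, then narrow. */
    if (v < 0 || v > 255)
        return -1;

    *ttl = (int)v;
    return 0;
}
```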
