Re: [FFmpeg-devel] [PATCH v2 2/2] swscale/aarch64: add hscale specializations

2022-05-25 Thread Martin Storsjö
On Wed, 25 May 2022, Swinney, Jonathan wrote: This patch adds code to support specializations of the hscale function and adds a specialization for filterSize == 4. ff_hscale8to15_4_neon is a complete rewrite. Since the main bottleneck here is loading the data from src, this data is loaded a who
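For readers unfamiliar with hscale, the arithmetic that a filterSize == 4 specialization has to reproduce can be sketched in scalar C. This is only an illustrative model, not the NEON code from the patch: the name hscale8to15_4_ref is hypothetical, and the shift by 7 and the clip to 15 bits are assumed to match the generic C scaler.

```c
#include <stdint.h>

/* Hypothetical scalar model of hscale for filterSize == 4 (not the patch).
 * Each output sample is a 4-tap dot product of source pixels and filter
 * coefficients, shifted down by 7 bits and clipped to the 15-bit maximum. */
static void hscale8to15_4_ref(int16_t *dst, int dstW, const uint8_t *src,
                              const int16_t *filter, const int32_t *filterPos)
{
    for (int i = 0; i < dstW; i++) {
        const uint8_t *s = src + filterPos[i]; /* first source pixel for output i */
        const int16_t *f = filter + 4 * i;     /* four coefficients for output i */
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];

        val >>= 7;
        dst[i] = val > 32767 ? 32767 : (int16_t)val;
    }
}
```

Every output reads four source bytes starting at filterPos[i], which is consistent with the patch description naming the loads from src as the main bottleneck.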

Re: [FFmpeg-devel] [PATCH v2 1/2] checkasm: added additional dstW tests for hscale

2022-05-25 Thread Martin Storsjö
On Wed, 25 May 2022, Swinney, Jonathan wrote: Signed-off-by: Jonathan Swinney --- tests/checkasm/sw_scale.c | 38 ++ 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c index 3c0a083b42..6c223c4

Re: [FFmpeg-devel] [PATCH v2 0/2] checkasm: added additional dstW tests for hscale

2022-05-25 Thread Martin Storsjö
On Wed, 25 May 2022, Swinney, Jonathan wrote: This is a resubmission of changes to the hscale function for aarch64. I added a test as a separate patch so that it would be easier to get consistent before and after performance data. After Martin already submitted the improvement to the final sec

[FFmpeg-devel] [PATCH] checkasm: Silence warnings about unused return value from read()

2022-05-25 Thread Martin Storsjö
This codepath is enabled by default on arm, if the linux perf API is available, unless disabled with --disable-linux-perf. --- tests/checkasm/checkasm.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h index a86db140e3..7f
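One common way to silence an unused-result warning for read() is to actually consume its return value. The sketch below illustrates that pattern only; read_counter is a hypothetical helper, and the real change in checkasm.h may look different.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): reads a 64-bit counter value
 * from fd. The return value of read() is checked, so compilers that flag
 * ignored results of read() have nothing to warn about. Returns 0 when the
 * read fails or delivers fewer bytes than requested. */
static uint64_t read_counter(int fd)
{
    uint64_t t = 0;

    if (read(fd, &t, sizeof(t)) != (ssize_t)sizeof(t))
        return 0;
    return t;
}
```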

Re: [FFmpeg-devel] [PATCH] checkasm: improve hevc_sao test

2022-05-25 Thread Martin Storsjö
On Wed, 25 May 2022, J. Dekker wrote: On 24 May 2022, at 22:27, Martin Storsjö wrote: On Tue, 17 May 2022, J. Dekker wrote: The HEVC decoder can call these functions with smaller widths than the functions themselves are designed to operate on so we should only check the relevant output

Re: [FFmpeg-devel] [PATCH v7 0/3] Support long file names on Windows

2022-05-25 Thread Martin Storsjö
On Tue, 24 May 2022, ffmpegagent wrote: This patchset adds support for long file and directory paths on Windows. The implementation follows the same logic that .NET is using internally, with the only exception that it doesn't expand short path components in 8.3 format. .NET does this as the same
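The prefixing idea can be illustrated with a simplified sketch on narrow strings. Everything here is an assumption for illustration: add_long_path_prefix and LONG_PATH_THRESHOLD are hypothetical names, the 260-character threshold is the classic Windows MAX_PATH, and the patchset itself presumably works on normalized wide-character paths.

```c
#include <stdio.h>
#include <string.h>

#define LONG_PATH_THRESHOLD 260 /* classic Windows MAX_PATH, assumed here */

/* Hypothetical helper (not the patchset's code): copies path into out and
 * adds an extended-length prefix when the path is long and absolute.
 * Returns 0 on success, -1 if out is too small. */
static int add_long_path_prefix(const char *path, char *out, size_t out_size)
{
    size_t len = strlen(path);
    const char *prefix = "";
    const char *rest = path;
    int n;

    if (len >= LONG_PATH_THRESHOLD &&
        strncmp(path, "\\\\?\\", 4) != 0 &&  /* already extended-length */
        strncmp(path, "\\\\.\\", 4) != 0) {  /* device path, left alone */
        if (strncmp(path, "\\\\", 2) == 0) {
            /* UNC path: swap the leading double backslash for the UNC form */
            prefix = "\\\\?\\UNC\\";
            rest = path + 2;
        } else if (path[1] == ':' && path[2] == '\\') {
            /* Drive-absolute path: prepend the extended-length prefix */
            prefix = "\\\\?\\";
        }
    }
    n = snprintf(out, out_size, "%s%s", prefix, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Short paths and relative paths pass through unchanged in this sketch, since the extended-length prefix only makes sense for fully qualified paths.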

Re: [FFmpeg-devel] [PATCH v6 2/2] avformat/os_support: Support long file names on Windows

2022-05-25 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: Martin Storsjö Sent: Tuesday, May 24, 2022 10:59 PM To: softworkz Cc: ffmpeg-devel@ffmpeg.org; Soft Works ; Hendrik Leppkes Subject: Re: [PATCH v6 2/2] avformat/os_support: Support long file names on Windows

Re: [FFmpeg-devel] [PATCH v6 2/2] avformat/os_support: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- libavformat/os_support.h | 87 +--- 1 file changed, 63 insertions(+), 24 deletions(-) diff --git a/libavformat/os_support.h b/libavformat/os_support.h index 5e6b32d2dc..179b9

Re: [FFmpeg-devel] [PATCH v6 1/2] avutil/wchar_filename, file_open: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- libavutil/file_open.c | 2 +- libavutil/wchar_filename.h | 180 + 2 files changed, 181 insertions(+), 1 deletion(-) This looks ok to me now, thanks! // Martin

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Tuesday, May 24, 2022 10:22 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8 On Tue, 24 May 2022

Re: [FFmpeg-devel] [PATCH] checkasm: improve hevc_sao test

2022-05-24 Thread Martin Storsjö
On Tue, 17 May 2022, J. Dekker wrote: The HEVC decoder can call these functions with smaller widths than the functions themselves are designed to operate on so we should only check the relevant output Signed-off-by: J. Dekker --- tests/checkasm/hevc_sao.c | 51 -
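The idea of checking only the relevant output can be sketched as a row-wise comparison that ignores bytes beyond the requested width. The helper name rows_differ is hypothetical; this is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): compares only the first `width`
 * bytes of each of `height` rows. Bytes between `width` and `stride` may be
 * written differently by an optimized function and are deliberately ignored.
 * Returns 1 if the visible areas differ, 0 otherwise. */
static int rows_differ(const uint8_t *a, const uint8_t *b,
                       ptrdiff_t stride, int width, int height)
{
    for (int y = 0; y < height; y++)
        if (memcmp(a + y * stride, b + y * stride, (size_t)width))
            return 1;
    return 0;
}
```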

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Tuesday, May 24, 2022 11:29 AM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8 On Mon, 23 May 2022

Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: Martin Storsjö Sent: Tuesday, May 24, 2022 1:26 PM To: Soft Works Cc: FFmpeg development discussions and patches ; Hendrik Leppkes Subject: RE: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names

Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: Martin Storsjö Sent: Tuesday, May 24, 2022 12:26 PM To: Soft Works Cc: FFmpeg development discussions and patches ; Hendrik Leppkes Subject: RE: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names

Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, Soft Works wrote: -Original Message- From: Martin Storsjö Sent: Tuesday, May 24, 2022 11:23 AM To: FFmpeg development discussions and patches Cc: softworkz ; Hendrik Leppkes Subject: Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-24 Thread Martin Storsjö
On Mon, 23 May 2022, Soft Works wrote: Great. I rebased and resubmitted both patchsets. The primary long-path patchset didn't need any change. Considerations for the latter were: - Should the file wchar_filename.h be renamed as it is now containing the path prefixing code? I guess we could

Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- libavformat/os_support.h | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/libavformat/os_support.h b/libavformat/os_support.h index 5e6b32d2dc..d4c07803a5 100644 --- a/libavf

Re: [FFmpeg-devel] [PATCH v5 1/2] avutil/wchar_filename, file_open: Support long file names on Windows

2022-05-24 Thread Martin Storsjö
On Tue, 24 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- libavutil/file_open.c | 2 +- libavutil/wchar_filename.h | 166 + 2 files changed, 167 insertions(+), 1 deletion(-) diff --git a/libavutil/file_open.c b/libavutil/file_

Re: [FFmpeg-devel] [PATCH v7 0/2] use av_fopen_utf8() instead of plain fopen()

2022-05-24 Thread Martin Storsjö
On Mon, 23 May 2022, ffmpegagent wrote: Unify file access operations by replacing usages of direct calls to posix fopen() v2: Remove changes to fftools for now v3: Add some additional replacements v4: Fix and improve commit messages v5: Add patch to remap ff_open in libavfilter for MSVC on Wind

Re: [FFmpeg-devel] [PATCH v5] avcodec/mfenc: Dynamically load MFPlat.DLL

2022-05-24 Thread Martin Storsjö
, avcodec will be linked directly against MFPlat.DLL. - MediaFoundation functions are now called like MFTEnumEx, like Martin Storsjö suggested in his review of the v3. I forgot to mention it on earlier versions, this patch addresses https://trac.ffmpeg.org/ticket/9788. diff --git a/libavcodec

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-23 Thread Martin Storsjö
On Mon, 23 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Monday, May 23, 2022 12:58 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8 On Mon, 23 May 2022

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-23 Thread Martin Storsjö
On Mon, 23 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Monday, May 23, 2022 12:53 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8 On Sat, 21 May 2022

Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-23 Thread Martin Storsjö
On Sat, 21 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Friday, May 20, 2022 11:13 PM To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8 Provide a header based inline reimplementation

[FFmpeg-devel] [PATCH 3/3] Switch uses of av_fopen_utf8 to avpriv_fopen_utf8

2022-05-20 Thread Martin Storsjö
--- libavfilter/af_arnndn.c | 2 +- libavfilter/opencl.c | 2 +- libavfilter/vf_curves.c | 2 +- libavfilter/vf_dnn_classify.c | 2 +- libavfilter/vf_dnn_detect.c | 2 +- libavfilter/vf_fieldhint.c| 2 +- libavfilter/vf_lut3d.c| 4 ++-- libavfilter/vf_nnedi.c

[FFmpeg-devel] [PATCH 2/3] libavutil: Deprecate av_fopen_utf8, provide an avpriv version

2022-05-20 Thread Martin Storsjö
Since every DLL can use an individual CRT on Windows, having an exported function that opens a FILE* won't work if that FILE* is going to be used from a different DLL (or from user application code). Internally within the libraries, the issue can be worked around by duplicating the function in all

[FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8

2022-05-20 Thread Martin Storsjö
Provide a header based inline reimplementation of it. Using av_fopen_utf8 doesn't work outside of the libraries when built with MSVC as shared libraries (in the default configuration, where each DLL gets a separate statically linked CRT). --- fftools/ffmpeg_opt.c | 3 +- fftools/fopen_utf8.h | 7
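A header-based helper of this kind might look roughly like the sketch below. It is an assumption about the general shape, not the contents of fftools/fopen_utf8.h; the name my_fopen_utf8 is hypothetical. Because the function is static inline, the FILE* is created by the CRT of whichever module includes the header, which sidesteps the cross-DLL problem.

```c
#include <stdio.h>
#ifdef _WIN32
#include <stdlib.h>
#include <windows.h>
#endif

/* Hypothetical helper (not the actual fftools header): opens a file whose
 * name is given as UTF-8. On Windows, path and mode are converted to UTF-16
 * and passed to _wfopen(); elsewhere plain fopen() is used. */
static inline FILE *my_fopen_utf8(const char *path, const char *mode)
{
#ifdef _WIN32
    wchar_t *wpath = NULL, *wmode = NULL;
    FILE *f = NULL;
    /* First pass: query the required buffer sizes, in wide characters. */
    int plen = MultiByteToWideChar(CP_UTF8, 0, path, -1, NULL, 0);
    int mlen = MultiByteToWideChar(CP_UTF8, 0, mode, -1, NULL, 0);

    if (plen <= 0 || mlen <= 0)
        return NULL;
    wpath = malloc(plen * sizeof(*wpath));
    wmode = malloc(mlen * sizeof(*wmode));
    /* Second pass: convert, then open with the wide-character API. */
    if (wpath && wmode &&
        MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, plen) > 0 &&
        MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, mlen) > 0)
        f = _wfopen(wpath, wmode);
    free(wpath);
    free(wmode);
    return f;
#else
    return fopen(path, mode);
#endif
}
```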

Re: [FFmpeg-devel] [PATCH v3] avcodec/mfenc: Dynamically load MFPlat.DLL

2022-05-20 Thread Martin Storsjö
On Fri, 20 May 2022, Trystan Mata wrote: From 2bdef1bdb93efa40b7d3fe21270f9f23465bee90 Mon Sep 17 00:00:00 2001 From: Trystan Mata Date: Fri, 20 May 2022 14:26:49 +0200 Subject: [PATCH] avcodec/mfenc: Dynamically load MFPlat.DLL Allow builds of FFmpeg with MediaFoundation to work under N editi

Re: [FFmpeg-devel] [PATCH v2] avcodec/libx264: allow to disable definition of X264_API_IMPORTS macro

2022-05-20 Thread Martin Storsjö
On Fri, 20 May 2022, Derek Buitenhuis wrote: On 5/20/2022 5:37 PM, Soft Works wrote: But if Matt's patch would be agreeable, then that would surely be the best outcome. I can rebase and resubmit his patch if you would find it agreeable. Ah - that was not clear to me. If Ubuntu LTS does inde

Re: [FFmpeg-devel] [PATCH v2] avfilter: use av_fopen_utf8() instead of plain fopen()

2022-05-10 Thread Martin Storsjö
On Mon, 9 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- use av_fopen_utf8() instead of plain fopen() Unify file access operations by replacing usages of direct calls to posix fopen() v2: Remove changes to fftools for now Published-As: https://github.c

Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-05-09 Thread Martin Storsjö
On Mon, 9 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Monday, May 9, 2022 11:42 AM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows On Mon

Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-05-09 Thread Martin Storsjö
On Mon, 9 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Sunday, May 8, 2022 10:12 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows On

Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-05-09 Thread Martin Storsjö
On Mon, 9 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Sunday, May 8, 2022 10:02 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows On Sat

Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-05-08 Thread Martin Storsjö
On Sat, 7 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Andreas Rheinhardt Sent: Saturday, May 7, 2022 6:32 AM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows Soft Works: -O

Re: [FFmpeg-devel] [PATCH 1/2] fftools: use av_fopen_utf8() instead of plain fopen()

2022-05-08 Thread Martin Storsjö
On Sat, 7 May 2022, softworkz wrote: From: softworkz Signed-off-by: softworkz --- fftools/cmdutils.c | 6 +++--- fftools/ffmpeg.c | 4 ++-- fftools/opt_common.c | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-) Just for clarity (for someone looking at this individual mail thread o

Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-05-08 Thread Martin Storsjö
On Sat, 7 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of Martin Storsjö Sent: Wednesday, April 20, 2022 2:48 PM To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows Hi, I just became aware

Re: [FFmpeg-devel] [PATCH v11 1/6] libavutil/wchar_filename.h: Add whcartoutf8, wchartoansi and utf8toansi

2022-05-08 Thread Martin Storsjö
On Sat, 7 May 2022, Soft Works wrote: -Original Message- From: ffmpeg-devel On Behalf Of nil- admir...@mailo.com Sent: Friday, May 6, 2022 6:08 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v11 1/6] libavutil/wchar_filename.h: Add whcartoutf8, wchartoansi and utf8toa

Re: [FFmpeg-devel] [PATCH] lib*/version: Move library version functions into files of their own

2022-05-06 Thread Martin Storsjö
On Fri, 6 May 2022, Andreas Rheinhardt wrote: This avoids having to rebuild big files every time FFMPEG_VERSION changes (which it does with every commit). Signed-off-by: Andreas Rheinhardt --- Makefile | 4 ++- ffbuild/common.mak | 2 -- libavcodec/Makefile|

Re: [FFmpeg-devel] PATCH - libmad MP3 decoding support

2022-05-02 Thread Martin Storsjö
On Mon, 2 May 2022, David Fletcher wrote: On 2/5/2022, "Nicolas George" wrote: Is there a trac ticket? If not, please fill one: we would not want to keep that bug. Regards, -- Nicolas George Hi Nicolas, I'll prepare a test case to demonstrate the issue and fill in a ticket. As far as I c

Re: [FFmpeg-devel] [PATCH 3/3] lavc/aarch64: add hevc sao edge 8x8

2022-04-28 Thread Martin Storsjö
On Thu, 28 Apr 2022, J. Dekker wrote: bench on AWS Graviton: hevc_sao_edge_8x8_8_c: 516.0 hevc_sao_edge_8x8_8_neon: 81.0 Signed-off-by: J. Dekker --- libavcodec/aarch64/hevcdsp_init_aarch64.c | 3 ++ libavcodec/aarch64/hevcdsp_sao_neon.S | 51 +++ 2 files changed, 54 in

Re: [FFmpeg-devel] [PATCH] avcodec/openh264: return (DE|EN)CODER_NOT_FOUND if version check fails

2022-04-27 Thread Martin Storsjö
On Wed, 20 Apr 2022, Martin Storsjö wrote: On Fri, 18 Feb 2022, Andreas Schneider wrote: Signed-off-by: Andreas Schneider --- libavcodec/libopenh264dec.c | 2 +- libavcodec/libopenh264enc.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/libavcodec/libopenh264dec.c b

Re: [FFmpeg-devel] [PATCH] arm64: Fix wrong BTI landing pad

2022-04-26 Thread Martin Storsjö
On Mon, 25 Apr 2022, Andre Kempe wrote: This patch fixes a wrong type of BTI landing pad when branching to functions instantiated via the fft*_neon macro. Although the previously employed paciasp instruction serves as a landing pad, for the ways that this function is invoked it is the wrong typ

Re: [FFmpeg-devel] [PATCH v11 1/6] libavutil/wchar_filename.h: Add whcartoutf8, wchartoansi and utf8toansi

2022-04-25 Thread Martin Storsjö
On Mon, 25 Apr 2022, Hendrik Leppkes wrote: On Mon, Apr 25, 2022 at 1:12 PM Soft Works wrote: From my point of view: ffmpeg is already working pretty well in handling long file paths (also with Unicode characters) when pre-fixing paths with \\?\, and this is working on all Windows versions wi

Re: [FFmpeg-devel] [PATCH] swscale: aarch64: Optimize the final summation in the hscale routine

2022-04-22 Thread Martin Storsjö
On Thu, 21 Apr 2022, Swinney, Jonathan wrote: Thanks for making this improvement. I will rebase my patches on your change. I also measured the performance on AWS Graviton 2 and 3. I added the numbers to your table. Before: Cortex A53 A72 A73 Graviton 2 Graviton

[FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing issue on Windows

2022-04-20 Thread Martin Storsjö
Hi, I just became aware of the av_fopen_utf8 function - which was introduced to fix path name translations on Windows - actually has a notable design flaw. Background: On Windows, a process can contain more than one C runtime (CRT); the system comes with two shared ones (UCRT and msvcrt.dl

Re: [FFmpeg-devel] [PATCH] avcodec/openh264: return (DE|EN)CODER_NOT_FOUND if version check fails

2022-04-20 Thread Martin Storsjö
On Fri, 18 Feb 2022, Andreas Schneider wrote: Signed-off-by: Andreas Schneider --- libavcodec/libopenh264dec.c | 2 +- libavcodec/libopenh264enc.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c index 7f5e85402a..97d

Re: [FFmpeg-devel] [PATCH v9 6/6] fftools: Use UTF-8 on Windows

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: --- fftools/fftools.manifest | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fftools/fftools.manifest b/fftools/fftools.manifest index 30b7d8fe..d1ac1e4e 100644 --- a/fftools/fftools.manifest +++ b/fftools/fftools.manifest @@ -3,8 +3

Re: [FFmpeg-devel] [PATCH v9 5/6] fftools: Enable long path support on Windows (fixes #8885)

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: --- fftools/Makefile | 5 + fftools/fftools.manifest | 10 ++ fftools/manifest.rc | 3 +++ 3 files changed, 18 insertions(+) create mode 100644 fftools/fftools.manifest create mode 100644 fftools/manifest.rc I think the change he

Re: [FFmpeg-devel] [PATCH v9 4/6] fftools/cmdutils.c: Remove MAX_PATH limit and replace fopen with av_fopen_utf8

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: --- fftools/cmdutils.c | 38 +- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/fftools/cmdutils.c b/fftools/cmdutils.c index 5d7cdc3e..a66dbb22 100644 --- a/fftools/cmdutils.c +++ b/fftools/cmdutils.c @@

Re: [FFmpeg-devel] [PATCH v9 3/6] compat/w32dlfcn.h: Remove MAX_PATH limit and replace LoadLibraryExA with LoadLibraryExW

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: --- compat/w32dlfcn.h | 78 ++- 1 file changed, 64 insertions(+), 14 deletions(-) diff --git a/compat/w32dlfcn.h b/compat/w32dlfcn.h index 52a94efa..0f41f50b 100644 --- a/compat/w32dlfcn.h +++ b/compat/w32dlfcn.

Re: [FFmpeg-devel] [PATCH v9 2/6] libavformat/avisynth.c: Remove MAX_PATH limit

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: --- libavformat/avisynth.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/libavformat/avisynth.c b/libavformat/avisynth.c index 8ba2bdea..f7bea8c3 100644 --- a/libavformat/avisynth.c +++ b/libavformat/avisynth.c @@ -34,6 +

Re: [FFmpeg-devel] [PATCH v9 1/6] libavutil/wchar_filename.h: Add whcartoutf8, wchartoansi and utf8toansi

2022-04-20 Thread Martin Storsjö
On Fri, 15 Apr 2022, Nil Admirari wrote: These functions are going to be used in libavformat/avisynth.c and fftools/cmdutils.c remove MAX_PATH limit. --- libavutil/wchar_filename.h | 51 ++ 1 file changed, 51 insertions(+) I looked through this patchset now,

[FFmpeg-devel] [PATCH] swscale: aarch64: Optimize the final summation in the hscale routine

2022-04-20 Thread Martin Storsjö
, around 3-8% for the smaller filter sizes. Inspired by a patch by Jonathan Swinney . Signed-off-by: Martin Storsjö --- I'll go ahead and apply this patch within a few days if there's no opposition, as it should be a fairly uncontroversial change. --- libswscale/aarch64/hscale.S | 14 +++

Re: [FFmpeg-devel] [PATCH 1/2] swscale/aarch64: add hscale specializations

2022-04-20 Thread Martin Storsjö
On Sun, 17 Apr 2022, Martin Storsjö wrote: On Fri, 15 Apr 2022, Swinney, Jonathan wrote: This patch adds specializations for hscale for filterSize == 4 and 8 and converts the existing implementation for the X8 version. For the old code, now used for the X8 version, it improves the efficiency

Re: [FFmpeg-devel] [PATCH 2/2] swscale/aarch64: add vscale specializations

2022-04-19 Thread Martin Storsjö
On Fri, 15 Apr 2022, Swinney, Jonathan wrote: This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By using specialized code with unrolling to match the filterSize we can improve performance. | (seconds) | c6g | | | | | - | - | - | |

Re: [FFmpeg-devel] [PATCH 1/1] librtmp: use AVBPrint instead of char *

2022-04-19 Thread Martin Storsjö
On Tue, 19 Apr 2022, Marton Balint wrote: On Sat, 16 Apr 2022, Martin Storsjö wrote: On Fri, 15 Apr 2022, Tristan Matthews wrote: This avoids having to do one pass to calculate the full length to allocate followed by a second pass to actually append values. --- libavformat/librtmp.c

Re: [FFmpeg-devel] [FFmpeg-cvslog] doc: install css files along html docs

2022-04-19 Thread Martin Storsjö
On Mon, 18 Apr 2022, Timo Rothenpieler wrote: ffmpeg | branch: master | Timo Rothenpieler | Thu Apr 7 20:11:24 2022 +0200| [d5687236aba6fd31dd4369c290df9a5b1192e43e] | committer: Timo Rothenpieler doc: install css files along html docs http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=comm

Re: [FFmpeg-devel] [PATCH 2/2] swscale/aarch64: add vscale specializations

2022-04-16 Thread Martin Storsjö
On Fri, 15 Apr 2022, Swinney, Jonathan wrote: This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By using specialized code with unrolling to match the filterSize we can improve performance. | (seconds) | c6g | | | | | - | - | - | |

Re: [FFmpeg-devel] [PATCH 1/2] swscale/aarch64: add hscale specializations

2022-04-16 Thread Martin Storsjö
On Fri, 15 Apr 2022, Swinney, Jonathan wrote: This patch adds specializations for hscale for filterSize == 4 and 8 and converts the existing implementation for the X8 version. For the old code, now used for the X8 version, it improves the efficiency of the final summations by reducing 11 instruc

Re: [FFmpeg-devel] [PATCH v2 0/1] lavc/aarch64: add some neon pix_abs functions

2022-04-16 Thread Martin Storsjö
On Fri, 15 Apr 2022, Martin Storsjö wrote: On Thu, 14 Apr 2022, Swinney, Jonathan wrote: Thanks Martin for the review. I made some updates according to the suggestions you made. I added a checkasm function, but I'm new to the test framework, so it may need some work still. Thank

Re: [FFmpeg-devel] [PATCH 1/1] librtmp: use AVBPrint instead of char *

2022-04-16 Thread Martin Storsjö
On Fri, 15 Apr 2022, Tristan Matthews wrote: This avoids having to do one pass to calculate the full length to allocate followed by a second pass to actually append values. --- libavformat/librtmp.c | 124 +++--- 1 file changed, 33 insertions(+), 91 deletions(-

Re: [FFmpeg-devel] [PATCH v2 1/1] lavc/aarch64: add some neon pix_abs functions

2022-04-15 Thread Martin Storsjö
On Thu, 14 Apr 2022, Swinney, Jonathan wrote: - ff_pix_abs16_neon - ff_pix_abs16_xy2_neon In direct micro benchmarks of these ff functions versus their C implementations, these functions performed as follows on AWS Graviton 2: ff_pix_abs16_neon: c: benchmark ran 10 iterations in 0.955383

Re: [FFmpeg-devel] [PATCH v2 0/1] lavc/aarch64: add some neon pix_abs functions

2022-04-15 Thread Martin Storsjö
On Thu, 14 Apr 2022, Swinney, Jonathan wrote: Thanks Martin for the review. I made some updates according to the suggestions you made. I added a checkasm function, but I'm new to the test framework, so it may need some work still. Thanks for putting in the effort to make a test - that adds

Re: [FFmpeg-devel] [PATCH v1] avformat/ipfsgateway: define PATH_MAX

2022-04-14 Thread Martin Storsjö
On Thu, 14 Apr 2022, Mark Gaiser wrote: On Thu, Apr 14, 2022 at 10:25 AM Martin Storsjö wrote: On Wed, 13 Apr 2022, Mark Gaiser wrote: > On Wed, Apr 13, 2022 at 5:21 PM Mark Gaiser wrote: > >> PATH_MAX is posix. Some compilers (MSVC) don't define this >> thus

Re: [FFmpeg-devel] [PATCH v1] avformat/ipfsgateway: define PATH_MAX

2022-04-14 Thread Martin Storsjö
On Wed, 13 Apr 2022, Mark Gaiser wrote: On Wed, Apr 13, 2022 at 5:21 PM Mark Gaiser wrote: PATH_MAX is posix. Some compilers (MSVC) don't define this thus failing to compile the ipfsgateway file. Defining it fixes the compile. Signed-off-by: Mark Gaiser --- libavformat/ipfsgateway.c | 6 ++
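A fallback definition of this kind typically follows the pattern below. The value 4096 and the buffer name are assumptions for illustration; the value chosen in the patch is not shown in this excerpt.

```c
#include <limits.h>

/* PATH_MAX normally comes from <limits.h> on POSIX systems. Toolchains that
 * do not provide it (MSVC, per the patch description) get a fallback.
 * The value 4096 is an assumption for illustration only. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical buffer sized with the (possibly fallback) constant. */
static char gateway_path_buf[PATH_MAX];
```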

Re: [FFmpeg-devel] [PATCH 1/1] librtmp: use AVBPrint instead of char *

2022-04-13 Thread Martin Storsjö
On Wed, 13 Apr 2022, Marton Balint wrote: On Wed, 13 Apr 2022, Martin Storsjö wrote: On Mon, 11 Apr 2022, Tristan Matthews wrote: This avoids having to do one pass to calculate the full length to allocate followed by a second pass to actually append values. --- libavformat/librtmp.c

Re: [FFmpeg-devel] [PATCH 1/1] librtmp: use AVBPrint instead of char *

2022-04-13 Thread Martin Storsjö
On Mon, 11 Apr 2022, Tristan Matthews wrote: This avoids having to do one pass to calculate the full length to allocate followed by a second pass to actually append values. --- libavformat/librtmp.c | 123 +++--- 1 file changed, 32 insertions(+), 91 deletions(-

Re: [FFmpeg-devel] [PATCH 4/4] fate/oma: Use REMUX where appropriate

2022-04-13 Thread Martin Storsjö
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote: Simplifies the checks. Signed-off-by: Andreas Rheinhardt --- tests/fate/oma.mak | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/tests/fate/oma.mak b/tests/fate/oma.mak index a088feff21..7e2020b7d0 100644 --- a/tests/f

Re: [FFmpeg-devel] [PATCH 3/4] fate/subtitles: Use REMUX where appropriate

2022-04-13 Thread Martin Storsjö
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote: It also adds the missing dependencies on the file and pipe protocols and the framecrc muxer. Signed-off-by: Andreas Rheinhardt --- tests/fate/subtitles.mak | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/fate/subtitles.mak

Re: [FFmpeg-devel] [PATCH 2/4] fate/image: Use TRANSCODE where appropriate

2022-04-13 Thread Martin Storsjö
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote: This also adds previously forgotten requirements. E.g. fate-jpg-icc actually depends on the png decoder, so that it should not be run when e.g. zlib is disabled, yet it happens, see http://fate.ffmpeg.org/report.cgi?time=20220411182746&slot=x86_64-a

Re: [FFmpeg-devel] [PATCH 1/4] tests/Makefile: Add auxiliary functions for transcode and stream_remux

2022-04-13 Thread Martin Storsjö
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote: Tests using the transcode and stream_remux functions have some common requirements (namely the file and pipe protocols as well as the framecrc muxer) and also other commonalities: They create a file and read it immediately afterwards, so that they ty

Re: [FFmpeg-devel] [PATCH v3 00/10] avcodec/vc1: Arm optimisations

2022-04-01 Thread Martin Storsjö
On Fri, 1 Apr 2022, Martin Storsjö wrote: On Thu, 31 Mar 2022, Ben Avison wrote: The VC1 decoder was missing lots of important fast paths for Arm, especially for 64-bit Arm. This submission fills in implementations for all functions where a fast path already existed and the fallback C

Re: [FFmpeg-devel] [PATCH v3 00/10] avcodec/vc1: Arm optimisations

2022-03-31 Thread Martin Storsjö
On Thu, 31 Mar 2022, Ben Avison wrote: The VC1 decoder was missing lots of important fast paths for Arm, especially for 64-bit Arm. This submission fills in implementations for all functions where a fast path already existed and the fallback C implementation was taking 1% or more of the runtime,

Re: [FFmpeg-devel] [PATCH 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths

2022-03-31 Thread Martin Storsjö
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 15:14, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: +// Clamp 16-bit signed block coefficients to signed 8-bit (biased by 128) +// On entry: +//   x0 -> array of 64x 16-bit coefficients +//   x1 -> 8-bit results +/
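The operation described in the quoted comment (clamp to signed 8-bit, then bias by 128) can be written as a scalar C reference. This is a sketch for orientation, assuming an 8x8 block and a byte stride for the output; the function name is hypothetical and this is not the NEON implementation under review.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: clamps 64 signed 16-bit coefficients to the
 * signed 8-bit range and stores them biased by 128, so -128 maps to 0 and
 * 127 maps to 255. `stride` is the distance in bytes between output rows. */
static void put_signed_pixels_clamped_ref(const int16_t *block,
                                          uint8_t *pixels, ptrdiff_t stride)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int v = block[8 * y + x];

            if (v < -128)
                v = -128;
            else if (v > 127)
                v = 127;
            pixels[x] = (uint8_t)(v + 128);
        }
        pixels += stride;
    }
}
```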

Re: [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths

2022-03-31 Thread Martin Storsjö
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 14:49, Martin Storsjö wrote: Looks generally reasonable. Is it possible to factorize out the individual transforms (so that you'd e.g. invoke the same macro twice in the 8x8 and 4x4 functions) without too much loss? There is a

Re: [FFmpeg-devel] [PATCH 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths

2022-03-31 Thread Martin Storsjö
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 13:35, Martin Storsjö wrote: Overall, the code looks sensible to me. Would it make sense to share the core of the filter between the horizontal/vertical cases with e.g. a macro? (I didn't check in detail if there's much differen

Re: [FFmpeg-devel] [PATCH 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer

2022-03-31 Thread Martin Storsjö
On Thu, 31 Mar 2022, Ben Avison wrote: On 29/03/2022 21:37, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: As with the rest of the checkasm tests - please unmacro most things where possible (except for the RANDOMIZE_* macros, those are ok to keep macroed if you want to). In

Re: [FFmpeg-devel] [PATCH 10/10] avcodec/vc1: Arm 32-bit NEON unescape fast path

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 918624.7 vc1dsp.vc1_unescape_buffer_neon: 142958.0 Signed-off-by: Ben Avison --- libavcodec/arm/vc1dsp_init_neon.c | 61 +++ libavcodec/arm/vc1dsp_neon.S

Re: [FFmpeg-devel] [PATCH 09/10] avcodec/vc1: Arm 64-bit NEON unescape fast path

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 655617.7 vc1dsp.vc1_unescape_buffer_neon: 118237.0 Signed-off-by: Ben Avison --- libavcodec/aarch64/vc1dsp_init_aarch64.c | 61 libavcodec/aarch64/vc1dsp_neo

Re: [FFmpeg-devel] [PATCH 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. idctdsp.add_pixels_clamped_c: 323.0 idctdsp.add_pixels_clamped_neon: 41.5 idctdsp.put_pixels_clamped_c: 243.0 idctdsp.put_pixels_clamped_neon: 30.0 idctdsp.put_signed_pixels_clamped_c: 225.7 idctdsp

Re: [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths

2022-03-30 Thread Martin Storsjö
On Wed, 30 Mar 2022, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_inv_trans_4x4_c: 158.2 vc1dsp.vc1_inv_trans_4x4_neon: 65.7 vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5 vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5

Re: [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_inv_trans_4x4_c: 158.2 vc1dsp.vc1_inv_trans_4x4_neon: 65.7 vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5 vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5 vc1dsp.vc1_inv_trans_4x8_c: 335.2 vc1dsp.vc1_inv_tran

Re: [FFmpeg-devel] [PATCH 06/10] avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the ti

Re: [FFmpeg-devel] [PATCH 06/10] avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the ti

Re: [FFmpeg-devel] [PATCH 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths

2022-03-30 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the ti

Re: [FFmpeg-devel] [PATCH] test: tiny_ssim: Don't include config.h

2022-03-30 Thread Martin Storsjö
On Sun, 27 Mar 2022, Martin Storsjö wrote: tiny_ssim is built for the build host, not for the target platform. Therefore, it mustn't include the config.h header, which is set up specifically for the target platform and compiler. This fixes cross building for older WinStore platforms,

Re: [FFmpeg-devel] [PATCH] vc1dsp: Change remaining stride parameters to ptrdiff_t

2022-03-30 Thread Martin Storsjö
On Tue, 29 Mar 2022, Ben Avison wrote: On 29/03/2022 13:44, Martin Storsjö wrote: The existing x86 assembly for loop filters uses the stride as a full register without clearing/sign extending the upper half of the registers on x86_64. This avoids crashes if the caller would have passed
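To illustrate the stride issue described above, here is a minimal C sketch. It assumes a 64-bit target and a hypothetical case where the upper half of the register happens to be zero; the ABI leaves those bits unspecified for a 32-bit int argument, so this is only one possible outcome. The function names are made up for the illustration and are not part of FFmpeg.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the offset that assembly sees when it uses the full
 * 64-bit register for an int stride whose upper 32 bits are zero instead of
 * a sign extension. */
static int64_t offset_zero_extended(int stride)
{
    return (int64_t)(uint64_t)(uint32_t)stride;
}

/* With a ptrdiff_t parameter the caller passes a full-width value, so a
 * negative stride stays negative. */
static ptrdiff_t offset_ptrdiff(ptrdiff_t stride)
{
    return stride;
}
```

In this model a stride of -16 becomes an offset of 4294967280 bytes when the upper half is zero, while the ptrdiff_t version keeps it at -16.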

[FFmpeg-devel] [PATCH v2] vc1dsp: Change remaining stride parameters to ptrdiff_t

2022-03-29 Thread Martin Storsjö
: Martin Storsjö --- Updated function signatures in the mips code too, updated the left_stride/right_stride parameters in the vc1_h_s_overlap function too, updated the comments in the x86 assembly. --- libavcodec/mips/vc1dsp_mips.h| 20 ++-- libavcodec/mips/vc1dsp_mmi.c

Re: [FFmpeg-devel] [PATCH 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer

2022-03-29 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: void ff_vc1dsp_init(VC1DSPContext* c); diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c index 0823ccad31..0ab5892403 100644 --- a/tests/checkasm/vc1dsp.c +++ b/tests/checkasm/vc1dsp.c @@ -286,6 +286,20 @@ static matrix *generate_inverse_quant

Re: [FFmpeg-devel] [PATCH 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests

2022-03-29 Thread Martin Storsjö
On Tue, 29 Mar 2022, Ben Avison wrote: Thirdly - the added test also occasionally fails for the other existing functions (armv6, neon) and the newly added aarch64 neon version. If you have e.g. src[] = 32767, dst[] = 255, then the widening 8->16 addition will overflow, as there's no operation
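The overflow case mentioned above (src[] = 32767, dst[] = 255) can be reproduced with a small scalar model in C. This is only a sketch: the function names are hypothetical, and the second function models a 16-bit lane that wraps before a saturating narrow, which is an assumption about how such SIMD code behaves rather than a transcription of the actual assembly.

```c
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Reference behaviour: the sum is formed in a full-width int, then clamped. */
static uint8_t add_clamped_wide(int16_t block, uint8_t pixel)
{
    return clip_uint8((int)block + (int)pixel);
}

/* Hypothetical 16-bit lane: the sum wraps modulo 2^16 before it is clamped. */
static uint8_t add_clamped_wrap16(int16_t block, uint8_t pixel)
{
    uint16_t u = (uint16_t)((int)block + (int)pixel);  /* keep the low 16 bits */
    int s = u >= 0x8000 ? (int)u - 0x10000 : (int)u;   /* reinterpret as signed */
    return clip_uint8(s);
}
```

For block = 32767 and pixel = 255 the wide version yields 255, whereas the wrapping version yields 0, because 33022 does not fit in a signed 16-bit lane and wraps to -32514. For inputs whose sum stays within the int16_t range, both versions agree.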

Re: [FFmpeg-devel] [PATCH 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests

2022-03-29 Thread Martin Storsjö
On Tue, 29 Mar 2022, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely that anyone is still using this, so I haven't p

Re: [FFmpeg-devel] [PATCH 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests

2022-03-29 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely that anyone is still using this, so I haven't put in the effort to debug it. I had a look at this fu

[FFmpeg-devel] [PATCH] vc1dsp: Change remaining stride parameters to ptrdiff_t

2022-03-29 Thread Martin Storsjö
: Martin Storsjö --- libavcodec/vc1dsp.c | 20 ++-- libavcodec/vc1dsp.h | 16 libavcodec/x86/vc1dsp_init.c | 16 3 files changed, 26 insertions(+), 26 deletions(-) diff --git a/libavcodec/vc1dsp.c b/libavcodec/vc1dsp.c index

Re: [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests

2022-03-29 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real

Re: [FFmpeg-devel] [PATCH 02/10] checkasm: Add vc1dsp inverse transform tests

2022-03-29 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: This test deliberately doesn't exercise the full range of inputs described in the committee draft VC-1 standard. It says: input coefficients in frequency domain, D, satisfy -2048 <= D < 2047 intermediate coefficients, E, satisfy -4096 <= E

Re: [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests

2022-03-29 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real

Re: [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests

2022-03-29 Thread Martin Storsjö
On Mon, 28 Mar 2022, Ben Avison wrote: On 25/03/2022 22:53, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: +#define CHECK_LOOP_FILTER(func) \ +    do {    \ +    if

[FFmpeg-devel] [PATCH] test: tiny_ssim: Don't include config.h

2022-03-26 Thread Martin Storsjö
nv(x) NULL". Signed-off-by: Martin Storsjö --- tests/tiny_ssim.c | 1 - 1 file changed, 1 deletion(-) diff --git a/tests/tiny_ssim.c b/tests/tiny_ssim.c index 08f8e92a03..9740652288 100644 --- a/tests/tiny_ssim.c +++ b/tests/tiny_ssim.c @@ -27,7 +27,6 @@ * overlapped 8x8 block sums, rather th

Re: [FFmpeg-devel] [GAS-PP PATCH] Handle the aarch64 tbnz instruction in the same way as tbz, for armasm64

2022-03-25 Thread Martin Storsjö
On Mon, 21 Mar 2022, Martin Storsjö wrote: --- I'll apply in a couple days if there's no comments. --- gas-preprocessor.pl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Pushed. // Martin

Re: [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests

2022-03-25 Thread Martin Storsjö
On Fri, 25 Mar 2022, Ben Avison wrote: Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real

Re: [FFmpeg-devel] [PATCH] rtpenc_vp8: Use 15-bit PictureIDs

2022-03-25 Thread Martin Storsjö
On Tue, 22 Mar 2022, ke...@muxable.com wrote: From: Kevin Wang 7-bit PictureIDs are not supported by WebRTC: https://groups.google.com/g/discuss-webrtc/c/333-L02vuWA In practice, 15-bit PictureIDs offer better compatibility. Signed-off-by: Kevin Wang --- libavformat/rtpenc_vp8.c | 3 ++- 1 f
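As a sketch of what a 15-bit PictureID looks like on the wire, assuming the VP8 RTP payload descriptor layout from RFC 7741 (M bit set in the first PictureID octet, followed by the 7 high bits, then 8 low bits in the next octet). The helper name is hypothetical and this is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a 15-bit PictureID into out[0..1]; returns the number of bytes
 * written. Bit 15 of picture_id is discarded, so the ID wraps at 2^15. */
static size_t write_picture_id15(uint8_t *out, uint16_t picture_id)
{
    out[0] = 0x80 | ((picture_id >> 8) & 0x7F); /* M = 1, high 7 bits */
    out[1] = picture_id & 0xFF;                 /* low 8 bits */
    return 2;
}
```

With the M bit clear, the field would instead be a single octet holding a 7-bit ID, which is the form the snippet above reports as unsupported by WebRTC.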
