On Wed, 25 May 2022, Swinney, Jonathan wrote:
This patch adds code to support specializations of the hscale function and adds
a specialization for filterSize == 4.
ff_hscale8to15_4_neon is a complete rewrite. Since the main bottleneck here is
loading the data from src, this data is loaded a who
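For orientation, the scalar work that a filterSize == 4 specialization unrolls can be sketched in C as below. This is only a minimal sketch, not the NEON code from the patch: the name hscale8to15_4_sketch is hypothetical, and the >> 7 shift and the 15-bit clamp are assumptions modeled on the generic C 8-bit to 15-bit horizontal scaler.

```c
#include <stdint.h>

/* Hypothetical scalar sketch: 8-bit input, 15-bit output, exactly four
 * filter taps per output sample. Not the code from the patch. */
static void hscale8to15_4_sketch(int16_t *dst, int dstW, const uint8_t *src,
                                 const int16_t *filter, const int32_t *filterPos)
{
    for (int i = 0; i < dstW; i++) {
        const uint8_t *s = src + filterPos[i]; /* first source sample */
        const int16_t *f = filter + 4 * i;     /* four coefficients   */
        /* Fully unrolled multiply-accumulate over the four taps. */
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];
        val >>= 7;                             /* assumed scaling shift */
        dst[i] = (int16_t)(val > 32767 ? 32767 : val); /* clamp to 15 bits */
    }
}
```

Because the tap count is fixed, the inner loop disappears entirely, which is what makes it practical to load the source data for several outputs at once in a vectorized version.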
On Wed, 25 May 2022, Swinney, Jonathan wrote:
Signed-off-by: Jonathan Swinney
---
tests/checkasm/sw_scale.c | 38 ++
1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index 3c0a083b42..6c223c4
On Wed, 25 May 2022, Swinney, Jonathan wrote:
This is a resubmission of changes to the hscale function for aarch64. I
added a test as a separate patch so that it would be easier to get
consistent before and after performance data. After Martin already
submitted the improvement to the final sec
This codepath is enabled by default on arm, if the linux perf API
is available, unless disabled with --disable-linux-perf.
---
tests/checkasm/checkasm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index a86db140e3..7f
On Wed, 25 May 2022, J. Dekker wrote:
On 24 May 2022, at 22:27, Martin Storsjö wrote:
On Tue, 17 May 2022, J. Dekker wrote:
The HEVC decoder can call these functions with smaller widths than the
functions themselves are designed to operate on so we should only check
the relevant output
On Tue, 24 May 2022, ffmpegagent wrote:
This patchset adds support for long file and directory paths on Windows. The
implementation follows the same logic that .NET is using internally, with
the only exception that it doesn't expand short path components in 8.3
format. .NET does this as the same
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: Martin Storsjö
Sent: Tuesday, May 24, 2022 10:59 PM
To: softworkz
Cc: ffmpeg-devel@ffmpeg.org; Soft Works ; Hendrik
Leppkes
Subject: Re: [PATCH v6 2/2] avformat/os_support: Support long file names
on Windows
On Tue, 24 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
libavformat/os_support.h | 87 +---
1 file changed, 63 insertions(+), 24 deletions(-)
diff --git a/libavformat/os_support.h b/libavformat/os_support.h
index 5e6b32d2dc..179b9
On Tue, 24 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
libavutil/file_open.c | 2 +-
libavutil/wchar_filename.h | 180 +
2 files changed, 181 insertions(+), 1 deletion(-)
This looks ok to me now, thanks!
// Martin
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of Martin
Storsjö
Sent: Tuesday, May 24, 2022 10:22 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8
On Tue, 24 May 2022
On Tue, 17 May 2022, J. Dekker wrote:
The HEVC decoder can call these functions with smaller widths than the
functions themselves are designed to operate on so we should only check
the relevant output
Signed-off-by: J. Dekker
---
tests/checkasm/hevc_sao.c | 51 -
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of Martin
Storsjö
Sent: Tuesday, May 24, 2022 11:29 AM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8
On Mon, 23 May 2022
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: Martin Storsjö
Sent: Tuesday, May 24, 2022 1:26 PM
To: Soft Works
Cc: FFmpeg development discussions and patches ;
Hendrik Leppkes
Subject: RE: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support
long file names
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: Martin Storsjö
Sent: Tuesday, May 24, 2022 12:26 PM
To: Soft Works
Cc: FFmpeg development discussions and patches ;
Hendrik Leppkes
Subject: RE: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support
long file names
On Tue, 24 May 2022, Soft Works wrote:
-Original Message-
From: Martin Storsjö
Sent: Tuesday, May 24, 2022 11:23 AM
To: FFmpeg development discussions and patches
Cc: softworkz ; Hendrik Leppkes
Subject: Re: [FFmpeg-devel] [PATCH v5 2/2] avformat/os_support: Support
long file names
On Mon, 23 May 2022, Soft Works wrote:
Great. I rebased and resubmitted both patchsets. The primary long-path
patchset didn't need any change.
Considerations for the latter were:
- Should the file wchar_filename.h be renamed as it is now containing
the path prefixing code?
I guess we could
On Tue, 24 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
libavformat/os_support.h | 16 +++-
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/libavformat/os_support.h b/libavformat/os_support.h
index 5e6b32d2dc..d4c07803a5 100644
--- a/libavf
On Tue, 24 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
libavutil/file_open.c | 2 +-
libavutil/wchar_filename.h | 166 +
2 files changed, 167 insertions(+), 1 deletion(-)
diff --git a/libavutil/file_open.c b/libavutil/file_
On Mon, 23 May 2022, ffmpegagent wrote:
Unify file access operations by replacing usages of direct calls to posix
fopen()
v2: Remove changes to fftools for now
v3: Add some additional replacements
v4: Fix and improve commit messages
v5: Add patch to remap ff_open in libavfilter for MSVC on Wind
, avcodec will be linked directly against MFPlat.DLL.
- MediaFoundation functions are now called like MFTEnumEx, like Martin
Storsjö suggested in his review of the v3.
I forgot to mention it on earlier versions, this patch addresses
https://trac.ffmpeg.org/ticket/9788.
diff --git a/libavcodec
On Mon, 23 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of Martin
Storsjö
Sent: Monday, May 23, 2022 12:58 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8
On Mon, 23 May 2022
On Mon, 23 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of Martin
Storsjö
Sent: Monday, May 23, 2022 12:53 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8
On Sat, 21 May 2022
On Sat, 21 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of Martin
Storsjö
Sent: Friday, May 20, 2022 11:13 PM
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH 1/3] fftools: Stop using av_fopen_utf8
Provide a header based inline reimplementation
---
libavfilter/af_arnndn.c | 2 +-
libavfilter/opencl.c | 2 +-
libavfilter/vf_curves.c | 2 +-
libavfilter/vf_dnn_classify.c | 2 +-
libavfilter/vf_dnn_detect.c | 2 +-
libavfilter/vf_fieldhint.c| 2 +-
libavfilter/vf_lut3d.c| 4 ++--
libavfilter/vf_nnedi.c
Since every DLL can use an individual CRT on Windows, having
an exported function that opens a FILE* won't work if that
FILE* is going to be used from a different DLL (or from user
application code).
Internally within the libraries, the issue can be worked around
by duplicating the function in all
Provide a header based inline reimplementation of it.
Using av_fopen_utf8 doesn't work outside of the libraries when built
with MSVC as shared libraries (in the default configuration, where
each DLL gets a separate statically linked CRT).
---
fftools/ffmpeg_opt.c | 3 +-
fftools/fopen_utf8.h | 7
On Fri, 20 May 2022, Trystan Mata wrote:
From 2bdef1bdb93efa40b7d3fe21270f9f23465bee90 Mon Sep 17 00:00:00 2001
From: Trystan Mata
Date: Fri, 20 May 2022 14:26:49 +0200
Subject: [PATCH] avcodec/mfenc: Dynamically load MFPlat.DLL
Allow builds of FFmpeg with MediaFoundation to work under N editi
On Fri, 20 May 2022, Derek Buitenhuis wrote:
On 5/20/2022 5:37 PM, Soft Works wrote:
But if Matt's patch would be agreeable, then that would surely be
the best outcome.
I can rebase and resubmit his patch if you would find it agreeable.
Ah - that was not clear to me.
If Ubuntu LTS does inde
On Mon, 9 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
use av_fopen_utf8() instead of plain fopen()
Unify file access operations by replacing usages of direct calls to
posix fopen()
v2: Remove changes to fftools for now
Published-As:
https://github.c
On Mon, 9 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of
Martin Storsjö
Sent: Monday, May 9, 2022 11:42 AM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object
sharing issue on Windows
On Mon
On Mon, 9 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of
Martin Storsjö
Sent: Sunday, May 8, 2022 10:12 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object
sharing issue on Windows
On
On Mon, 9 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of
Martin Storsjö
Sent: Sunday, May 8, 2022 10:02 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object
sharing issue on Windows
On Sat
On Sat, 7 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of
Andreas Rheinhardt
Sent: Saturday, May 7, 2022 6:32 AM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object
sharing issue on Windows
Soft Works:
-O
On Sat, 7 May 2022, softworkz wrote:
From: softworkz
Signed-off-by: softworkz
---
fftools/cmdutils.c | 6 +++---
fftools/ffmpeg.c | 4 ++--
fftools/opt_common.c | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
Just for clarity (for someone looking at this individual mail thread o
On Sat, 7 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of
Martin Storsjö
Sent: Wednesday, April 20, 2022 2:48 PM
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] av_fopen_utf8 and cross-DLL CRT object sharing
issue on Windows
Hi,
I just became aware
On Sat, 7 May 2022, Soft Works wrote:
-Original Message-
From: ffmpeg-devel On Behalf Of nil-
admir...@mailo.com
Sent: Friday, May 6, 2022 6:08 PM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v11 1/6]
libavutil/wchar_filename.h: Add whcartoutf8, wchartoansi and
utf8toa
On Fri, 6 May 2022, Andreas Rheinhardt wrote:
This avoids having to rebuild big files every time FFMPEG_VERSION
changes (which it does with every commit).
Signed-off-by: Andreas Rheinhardt
---
Makefile | 4 ++-
ffbuild/common.mak | 2 --
libavcodec/Makefile|
On Mon, 2 May 2022, David Fletcher wrote:
On 2/5/2022, "Nicolas George" wrote:
Is there a trac ticket? If not, please file one: we would not want to
keep that bug.
Regards,
--
Nicolas George
Hi Nicolas,
I'll prepare a test case to demonstrate the issue and fill in a ticket.
As far as I c
On Thu, 28 Apr 2022, J. Dekker wrote:
bench on AWS Graviton:
hevc_sao_edge_8x8_8_c: 516.0
hevc_sao_edge_8x8_8_neon: 81.0
Signed-off-by: J. Dekker
---
libavcodec/aarch64/hevcdsp_init_aarch64.c | 3 ++
libavcodec/aarch64/hevcdsp_sao_neon.S | 51 +++
2 files changed, 54 in
On Wed, 20 Apr 2022, Martin Storsjö wrote:
On Fri, 18 Feb 2022, Andreas Schneider wrote:
Signed-off-by: Andreas Schneider
---
libavcodec/libopenh264dec.c | 2 +-
libavcodec/libopenh264enc.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/libopenh264dec.c b
On Mon, 25 Apr 2022, Andre Kempe wrote:
This patch fixes a wrong type of BTI landing pad when branching to
functions instantiated via the fft*_neon macro.
Although the previously employed paciasp instruction serves as a landing
pad, for the ways that this function is invoked it is the wrong typ
On Mon, 25 Apr 2022, Hendrik Leppkes wrote:
On Mon, Apr 25, 2022 at 1:12 PM Soft Works wrote:
From my point of view:
ffmpeg is already working pretty well in handling long file paths (also with
Unicode characters) when pre-fixing paths with \\?\, and this is working
on all Windows versions wi
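The prefixing idea mentioned above can be illustrated with a small C helper. This is a simplified sketch under stated assumptions, not the patchset's implementation: the function name add_long_path_prefix is hypothetical, only drive-letter absolute paths with backslashes are handled, and UNC paths, relative paths and path normalization are deliberately left out.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prepend the \\?\ prefix to a drive-letter absolute
 * path such as C:\dir\file. Other inputs are copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int add_long_path_prefix(const char *path, char *out, size_t out_size)
{
    static const char prefix[] = "\\\\?\\"; /* the four characters \\?\ */
    int has_prefix = strncmp(path, prefix, 4) == 0;
    int is_drive_abs = isalpha((unsigned char)path[0]) &&
                       path[1] == ':' && path[2] == '\\';
    int n;

    if (has_prefix || !is_drive_abs)
        n = snprintf(out, out_size, "%s", path);
    else
        n = snprintf(out, out_size, "%s%s", prefix, path);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A real implementation would work on wide-character strings and would also need to normalize the path first, since prefixed paths are passed to the file system without the usual processing.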
On Thu, 21 Apr 2022, Swinney, Jonathan wrote:
Thanks for making this improvement. I will rebase my patches on your change. I
also measured the performance on AWS Graviton 2 and 3. I added the numbers to
your table.
Before: Cortex A53 A72 A73 Graviton 2 Graviton
Hi,
I just became aware that the av_fopen_utf8 function - which was introduced
to fix path name translations on Windows - actually has a notable design
flaw.
Background:
On Windows, a process can contain more than one C runtime (CRT); the
system comes with two shared ones (UCRT and msvcrt.dl
On Fri, 18 Feb 2022, Andreas Schneider wrote:
Signed-off-by: Andreas Schneider
---
libavcodec/libopenh264dec.c | 2 +-
libavcodec/libopenh264enc.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 7f5e85402a..97d
On Fri, 15 Apr 2022, Nil Admirari wrote:
---
fftools/fftools.manifest | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fftools/fftools.manifest b/fftools/fftools.manifest
index 30b7d8fe..d1ac1e4e 100644
--- a/fftools/fftools.manifest
+++ b/fftools/fftools.manifest
@@ -3,8 +3
On Fri, 15 Apr 2022, Nil Admirari wrote:
---
fftools/Makefile | 5 +
fftools/fftools.manifest | 10 ++
fftools/manifest.rc | 3 +++
3 files changed, 18 insertions(+)
create mode 100644 fftools/fftools.manifest
create mode 100644 fftools/manifest.rc
I think the change he
On Fri, 15 Apr 2022, Nil Admirari wrote:
---
fftools/cmdutils.c | 38 +-
1 file changed, 29 insertions(+), 9 deletions(-)
diff --git a/fftools/cmdutils.c b/fftools/cmdutils.c
index 5d7cdc3e..a66dbb22 100644
--- a/fftools/cmdutils.c
+++ b/fftools/cmdutils.c
@@
On Fri, 15 Apr 2022, Nil Admirari wrote:
---
compat/w32dlfcn.h | 78 ++-
1 file changed, 64 insertions(+), 14 deletions(-)
diff --git a/compat/w32dlfcn.h b/compat/w32dlfcn.h
index 52a94efa..0f41f50b 100644
--- a/compat/w32dlfcn.h
+++ b/compat/w32dlfcn.
On Fri, 15 Apr 2022, Nil Admirari wrote:
---
libavformat/avisynth.c | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/libavformat/avisynth.c b/libavformat/avisynth.c
index 8ba2bdea..f7bea8c3 100644
--- a/libavformat/avisynth.c
+++ b/libavformat/avisynth.c
@@ -34,6 +
On Fri, 15 Apr 2022, Nil Admirari wrote:
These functions are going to be used in libavformat/avisynth.c
and fftools/cmdutils.c to remove the MAX_PATH limit.
---
libavutil/wchar_filename.h | 51 ++
1 file changed, 51 insertions(+)
I looked through this patchset now,
, around 3-8% for the smaller filter sizes.
Inspired by a patch by Jonathan Swinney .
Signed-off-by: Martin Storsjö
---
I'll go ahead and apply this patch within a few days if there's no
opposition, as it should be a fairly uncontroversial change.
---
libswscale/aarch64/hscale.S | 14 +++
On Sun, 17 Apr 2022, Martin Storsjö wrote:
On Fri, 15 Apr 2022, Swinney, Jonathan wrote:
This patch adds specializations for hscale for filterSize == 4 and 8 and
converts the existing implementation for the X8 version. For the old code, now
used for the X8 version, it improves the efficiency
On Fri, 15 Apr 2022, Swinney, Jonathan wrote:
This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By
using specialized code with unrolling to match the filterSize we can improve
performance.
| (seconds) | c6g | | |
| | - | - | - |
|
On Tue, 19 Apr 2022, Marton Balint wrote:
On Sat, 16 Apr 2022, Martin Storsjö wrote:
On Fri, 15 Apr 2022, Tristan Matthews wrote:
This avoids having to do one pass to calculate the full length to allocate
followed by a second pass to actually append values.
---
libavformat/librtmp.c
On Mon, 18 Apr 2022, Timo Rothenpieler wrote:
ffmpeg | branch: master | Timo Rothenpieler | Thu Apr 7 20:11:24 2022 +0200| [d5687236aba6fd31dd4369c290df9a5b1192e43e] | committer: Timo Rothenpieler
doc: install css files along html docs
http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=comm
On Fri, 15 Apr 2022, Swinney, Jonathan wrote:
This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By
using specialized code with unrolling to match the filterSize we can improve
performance.
| (seconds) | c6g | | |
| | - | - | - |
|
On Fri, 15 Apr 2022, Swinney, Jonathan wrote:
This patch adds specializations for hscale for filterSize == 4 and 8 and
converts the existing implementation for the X8 version. For the old code, now
used for the X8 version, it improves the efficiency of the final summations by
reducing 11 instruc
On Fri, 15 Apr 2022, Martin Storsjö wrote:
On Thu, 14 Apr 2022, Swinney, Jonathan wrote:
Thanks Martin for the review. I made some updates according to the
suggestions you made.
I added a checkasm function, but I'm new to the test framework, so it may
need some work still.
Thank
On Fri, 15 Apr 2022, Tristan Matthews wrote:
This avoids having to do one pass to calculate the full length to allocate
followed by a second pass to actually append values.
---
libavformat/librtmp.c | 124 +++---
1 file changed, 33 insertions(+), 91 deletions(-
On Thu, 14 Apr 2022, Swinney, Jonathan wrote:
- ff_pix_abs16_neon
- ff_pix_abs16_xy2_neon
In direct micro benchmarks of these ff functions versus their C implementations,
these functions performed as follows on AWS Graviton 2:
ff_pix_abs16_neon:
c: benchmark ran 10 iterations in 0.955383
On Thu, 14 Apr 2022, Swinney, Jonathan wrote:
Thanks Martin for the review. I made some updates according to the
suggestions you made.
I added a checkasm function, but I'm new to the test framework, so it
may need some work still.
Thanks for putting in the effort to make a test - that adds
On Thu, 14 Apr 2022, Mark Gaiser wrote:
On Thu, Apr 14, 2022 at 10:25 AM Martin Storsjö wrote:
On Wed, 13 Apr 2022, Mark Gaiser wrote:
> On Wed, Apr 13, 2022 at 5:21 PM Mark Gaiser wrote:
>
>> PATH_MAX is posix. Some compilers (MSVC) don't define this
>> thus
On Wed, 13 Apr 2022, Mark Gaiser wrote:
On Wed, Apr 13, 2022 at 5:21 PM Mark Gaiser wrote:
PATH_MAX is posix. Some compilers (MSVC) don't define this
thus failing to compile the ipfsgateway file.
Defining it fixes the compile.
Signed-off-by: Mark Gaiser
---
libavformat/ipfsgateway.c | 6 ++
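The kind of fallback described above can be sketched as follows. This is a minimal illustration, not the patch itself: the value 4096 is an assumption, and copy_path_checked is a hypothetical helper added only to show the macro in use.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* PATH_MAX comes from POSIX; provide a fallback where the toolchain's
 * headers do not define it. The value 4096 is an assumption. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper: copy src into a PATH_MAX-sized buffer, refusing
 * paths that would not fit together with the terminating NUL. */
static int copy_path_checked(char dst[PATH_MAX], const char *src)
{
    size_t len = strlen(src);
    if (len >= (size_t)PATH_MAX)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}
```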
On Wed, 13 Apr 2022, Marton Balint wrote:
On Wed, 13 Apr 2022, Martin Storsjö wrote:
On Mon, 11 Apr 2022, Tristan Matthews wrote:
This avoids having to do one pass to calculate the full length to allocate
followed by a second pass to actually append values.
---
libavformat/librtmp.c
On Mon, 11 Apr 2022, Tristan Matthews wrote:
This avoids having to do one pass to calculate the full length to allocate
followed by a second pass to actually append values.
---
libavformat/librtmp.c | 123 +++---
1 file changed, 32 insertions(+), 91 deletions(-
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote:
Simplifies the checks.
Signed-off-by: Andreas Rheinhardt
---
tests/fate/oma.mak | 10 ++
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/tests/fate/oma.mak b/tests/fate/oma.mak
index a088feff21..7e2020b7d0 100644
--- a/tests/f
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote:
It also adds the missing dependencies on the file and pipe protocols
and the framecrc muxer.
Signed-off-by: Andreas Rheinhardt
---
tests/fate/subtitles.mak | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/fate/subtitles.mak
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote:
This also adds previously forgotten requirements. E.g. fate-jpg-icc
actually depends on the png decoder, so that it should not be run
when e.g. zlib is disabled, yet it happens, see
http://fate.ffmpeg.org/report.cgi?time=20220411182746&slot=x86_64-a
On Tue, 12 Apr 2022, Andreas Rheinhardt wrote:
Tests using the transcode and stream_remux functions have some common
requirements (namely the file and pipe protocols as well as the framecrc
muxer) and also other commonalities: They create a file and read it
immediately afterwards, so that they ty
On Fri, 1 Apr 2022, Martin Storsjö wrote:
On Thu, 31 Mar 2022, Ben Avison wrote:
The VC1 decoder was missing lots of important fast paths for Arm, especially
for 64-bit Arm. This submission fills in implementations for all functions
where a fast path already existed and the fallback C
On Thu, 31 Mar 2022, Ben Avison wrote:
The VC1 decoder was missing lots of important fast paths for Arm, especially
for 64-bit Arm. This submission fills in implementations for all functions
where a fast path already existed and the fallback C implementation was
taking 1% or more of the runtime,
On Thu, 31 Mar 2022, Ben Avison wrote:
On 30/03/2022 15:14, Martin Storsjö wrote:
On Fri, 25 Mar 2022, Ben Avison wrote:
+// Clamp 16-bit signed block coefficients to signed 8-bit (biased by 128)
+// On entry:
+// x0 -> array of 64x 16-bit coefficients
+// x1 -> 8-bit results
+/
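The operation named in the quoted comment (clamp each signed 16-bit coefficient to the signed 8-bit range, then bias by 128) can be written as a scalar C reference. This is a sketch for illustration only: the function name and the flat array layout are assumptions, and the real routine writes 8x8 blocks with a stride.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: clamp each 16-bit signed coefficient to
 * [-128, 127] and add a bias of 128, giving an unsigned 8-bit result. */
static void clamp_signed_biased(const int16_t *coeffs, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = coeffs[i];
        if (v < -128)
            v = -128;
        else if (v > 127)
            v = 127;
        dst[i] = (uint8_t)(v + 128);
    }
}
```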
On Thu, 31 Mar 2022, Ben Avison wrote:
On 30/03/2022 14:49, Martin Storsjö wrote:
Looks generally reasonable. Is it possible to factorize out the individual
transforms (so that you'd e.g. invoke the same macro twice in the 8x8 and
4x4 functions) without too much loss?
There is a
On Thu, 31 Mar 2022, Ben Avison wrote:
On 30/03/2022 13:35, Martin Storsjö wrote:
Overall, the code looks sensible to me.
Would it make sense to share the core of the filter between the
horizontal/vertical cases with e.g. a macro? (I didn't check in detail if
there's much differen
On Thu, 31 Mar 2022, Ben Avison wrote:
On 29/03/2022 21:37, Martin Storsjö wrote:
On Fri, 25 Mar 2022, Ben Avison wrote:
As with the rest of the checkasm tests - please unmacro most things where
possible (except for the RANDOMIZE_* macros, those are ok to keep macroed
if you want to).
In
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_unescape_buffer_c: 918624.7
vc1dsp.vc1_unescape_buffer_neon: 142958.0
Signed-off-by: Ben Avison
---
libavcodec/arm/vc1dsp_init_neon.c | 61 +++
libavcodec/arm/vc1dsp_neon.S
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_unescape_buffer_c: 655617.7
vc1dsp.vc1_unescape_buffer_neon: 118237.0
Signed-off-by: Ben Avison
---
libavcodec/aarch64/vc1dsp_init_aarch64.c | 61
libavcodec/aarch64/vc1dsp_neo
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
idctdsp.add_pixels_clamped_c: 323.0
idctdsp.add_pixels_clamped_neon: 41.5
idctdsp.put_pixels_clamped_c: 243.0
idctdsp.put_pixels_clamped_neon: 30.0
idctdsp.put_signed_pixels_clamped_c: 225.7
idctdsp
On Wed, 30 Mar 2022, Martin Storsjö wrote:
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_inv_trans_4x4_c: 158.2
vc1dsp.vc1_inv_trans_4x4_neon: 65.7
vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5
vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_inv_trans_4x4_c: 158.2
vc1dsp.vc1_inv_trans_4x4_neon: 65.7
vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5
vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5
vc1dsp.vc1_inv_trans_4x8_c: 335.2
vc1dsp.vc1_inv_tran
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the ti
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the ti
On Fri, 25 Mar 2022, Ben Avison wrote:
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the ti
On Sun, 27 Mar 2022, Martin Storsjö wrote:
tiny_ssim is built for the build host, not for the target platform.
Therefore, it mustn't include the config.h header, which is set up
specifically for the target platform and compiler.
This fixes cross building for older WinStore platforms,
On Tue, 29 Mar 2022, Ben Avison wrote:
On 29/03/2022 13:44, Martin Storsjö wrote:
The existing x86 assembly for loop filters uses the stride as a
full register without clearing/sign extending the upper half
of the registers on x86_64.
This avoids crashes if the caller would have passed
: Martin Storsjö
---
Updated function signatures in the mips code too, updated the
left_stride/right_stride parameters in the vc1_h_s_overlap
function too, updated the comments in the x86 assembly.
---
libavcodec/mips/vc1dsp_mips.h| 20 ++--
libavcodec/mips/vc1dsp_mmi.c
On Fri, 25 Mar 2022, Ben Avison wrote:
void ff_vc1dsp_init(VC1DSPContext* c);
diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c
index 0823ccad31..0ab5892403 100644
--- a/tests/checkasm/vc1dsp.c
+++ b/tests/checkasm/vc1dsp.c
@@ -286,6 +286,20 @@ static matrix *generate_inverse_quant
On Tue, 29 Mar 2022, Ben Avison wrote:
Thirdly - the added test also occasionally fails for the other existing
functions (armv6, neon) and the newly added aarch64 neon version. If you
have e.g. src[] = 32767, dst[] = 255, then the widening 8->16 addition
will overflow, as there's no operation
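The failure case quoted above can be reproduced in plain C. The sketch below is an illustration, not code from any of the patches, and both function names are hypothetical. It compares an addition done at full int width against one whose sum is truncated to 16 bits before clamping, as happens with a non-saturating widening add.

```c
#include <stdint.h>

/* Clamp a value to the 0..255 pixel range. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Reference behaviour: the sum is formed at int width, so it cannot wrap. */
static uint8_t add_clamped_ref(int16_t coeff, uint8_t pixel)
{
    return clip_u8(coeff + pixel);
}

/* Models a 16-bit lane: the sum wraps modulo 2^16 and is then
 * reinterpreted as a signed 16-bit value before clamping. */
static uint8_t add_clamped_wrap16(int16_t coeff, uint8_t pixel)
{
    uint16_t u = (uint16_t)((uint16_t)coeff + pixel);
    int sum = u >= 0x8000 ? (int)u - 0x10000 : (int)u;
    return clip_u8(sum);
}
```

With coeff = 32767 and pixel = 255 the true sum is 33022, which does not fit in a signed 16-bit lane and wraps to -32514, so the wrapped version clamps to 0 where the reference gives 255.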
On Tue, 29 Mar 2022, Martin Storsjö wrote:
On Fri, 25 Mar 2022, Ben Avison wrote:
Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this
is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely
that anyone is still using this, so I haven't p
On Fri, 25 Mar 2022, Ben Avison wrote:
Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this
is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely
that anyone is still using this, so I haven't put in the effort to debug it.
I had a look at this fu
: Martin Storsjö
---
libavcodec/vc1dsp.c | 20 ++--
libavcodec/vc1dsp.h | 16
libavcodec/x86/vc1dsp_init.c | 16
3 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/libavcodec/vc1dsp.c b/libavcodec/vc1dsp.c
index
On Fri, 25 Mar 2022, Ben Avison wrote:
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real
On Fri, 25 Mar 2022, Ben Avison wrote:
This test deliberately doesn't exercise the full range of inputs described in
the committee draft VC-1 standard. It says:
input coefficients in frequency domain, D, satisfy -2048 <= D < 2047
intermediate coefficients, E, satisfy -4096 <= E
On Fri, 25 Mar 2022, Ben Avison wrote:
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real
On Mon, 28 Mar 2022, Ben Avison wrote:
On 25/03/2022 22:53, Martin Storsjö wrote:
On Fri, 25 Mar 2022, Ben Avison wrote:
+#define CHECK_LOOP_FILTER(func) \
+ do { \
+ if
nv(x) NULL".
Signed-off-by: Martin Storsjö
---
tests/tiny_ssim.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tests/tiny_ssim.c b/tests/tiny_ssim.c
index 08f8e92a03..9740652288 100644
--- a/tests/tiny_ssim.c
+++ b/tests/tiny_ssim.c
@@ -27,7 +27,6 @@
* overlapped 8x8 block sums, rather th
On Mon, 21 Mar 2022, Martin Storsjö wrote:
---
I'll apply in a couple days if there's no comments.
---
gas-preprocessor.pl | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Pushed.
// Martin
___
ffmpeg-devel mailing list
ff
On Fri, 25 Mar 2022, Ben Avison wrote:
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real
On Tue, 22 Mar 2022, ke...@muxable.com wrote:
From: Kevin Wang
7-bit PictureIDs are not supported by WebRTC:
https://groups.google.com/g/discuss-webrtc/c/333-L02vuWA
In practice, 15-bit PictureIDs offer better compatibility.
Signed-off-by: Kevin Wang
---
libavformat/rtpenc_vp8.c | 3 ++-
1 f