From: Rémi Denis-Courmont
To avoid data dependencies, this does the following unroll, which
requires one extra but probably free addition:
coeff = (b * left_weight) >> decorr_shift;
b += a;
a -= coeff;
b -= coeff;
swap(a, b);
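As a sanity check on the rewrite above, here is a minimal C sketch that compares it with a dependent form in which b has to wait for the updated a. The dependent form is an assumption about what the loop body looked like before the change, not a quote from the patch, and the names pair32, decorr_serial and decorr_unrolled are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair type, used only for this illustration. */
typedef struct { int32_t a, b; } pair32;

/* Assumed dependent form: "b += a" has to wait for the updated a. */
static pair32 decorr_serial(int32_t a, int32_t b, int left_weight, int decorr_shift)
{
    assert(decorr_shift >= 0 && decorr_shift < 31);
    a -= (b * left_weight) >> decorr_shift;
    b += a;
    return (pair32){ b, a };   /* the two outputs are stored swapped */
}

/* Form from the commit message: once coeff is known, the updates of
 * a and b no longer depend on each other. */
static pair32 decorr_unrolled(int32_t a, int32_t b, int left_weight, int decorr_shift)
{
    assert(decorr_shift >= 0 && decorr_shift < 31);
    int32_t coeff = (b * left_weight) >> decorr_shift;
    b += a;                    /* uses the original a: the extra addition */
    a -= coeff;
    b -= coeff;
    return (pair32){ b, a };   /* swap(a, b) */
}
```

Both functions compute a - coeff and a + b - coeff; the second one only reorders the arithmetic so that neither result waits on the other.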
---
libavcodec/alacdsp.c| 4 ++-
From: Rémi Denis-Courmont
---
libavcodec/riscv/alacdsp_init.c | 5 +
libavcodec/riscv/alacdsp_rvv.S | 27 +++
2 files changed, 32 insertions(+)
diff --git a/libavcodec/riscv/alacdsp_init.c b/libavcodec/riscv/alacdsp_init.c
index 37688be67b..fa8a7c8129 100644
---
From: Rémi Denis-Courmont
---
libavcodec/riscv/alacdsp_init.c | 8 +++-
libavcodec/riscv/alacdsp_rvv.S | 18 ++
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/alacdsp_init.c b/libavcodec/riscv/alacdsp_init.c
index 9ddebaa60b..37688be67b
From: Rémi Denis-Courmont
'VSETVLI xd, x0, ...' has rather nonobvious semantics:
- If xd is x0, then it preserves the current vector length.
- If xd is not x0, it sets the vector length to the supported maximum.
Also somewhat confusingly, while VMV.X.S always does its thing
regardless of the
From: Rémi Denis-Courmont
---
libavcodec/riscv/bswapdsp_init.c | 5 -
libavcodec/riscv/bswapdsp_rvv.S | 17 +
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/bswapdsp_init.c b/libavcodec/riscv/bswapdsp_init.c
index c17b6b75bb..abe84ec1f7
From: Rémi Denis-Courmont
---
libavcodec/riscv/Makefile| 1 +
libavcodec/riscv/bswapdsp_init.c | 7 -
libavcodec/riscv/bswapdsp_rvv.S | 45
3 files changed, 52 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/riscv/bswapdsp_rvv.S
diff
From: Rémi Denis-Courmont
Simply using the Zbb REV8 instruction in a simple loop already gives
significant savings:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 771.0
But we can also use the 64-bit REV8 as a pseudo-SIMD instruction with
just one additional shift, and one fewer load, effectively
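The pseudo-SIMD idea can be sketched in portable C as follows. This is an illustration under stated assumptions, not the code from the patch: rev8_64 stands in for the 64-bit REV8 instruction, bswap32_pair is a hypothetical helper, and packing the two words as w0 | w1 << 32 models a single little-endian 64-bit load.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 64-bit byte reversal, standing in for the RV64 Zbb REV8. */
static uint64_t rev8_64(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8)  | ((x >> 8)  & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Byte-swap two 32-bit words with a single 64-bit reversal.
 * w0 is the word at the lower address, w1 the one after it (assumed
 * little-endian layout). */
static void bswap32_pair(uint32_t out[2], uint32_t w0, uint32_t w1)
{
    assert(out);
    uint64_t v = rev8_64((uint64_t)w0 | ((uint64_t)w1 << 32));

    /* Reversing all eight bytes also exchanges the two halves, so the
     * swapped w0 ends up in the upper half; one shift recovers it. */
    out[0] = (uint32_t)(v >> 32);
    out[1] = (uint32_t)v;
}
```

For example, the pair 0x11223344, 0xAABBCCDD comes out as 0x44332211, 0xDDCCBBAA.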
From: Rémi Denis-Courmont
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline,
From: Rémi Denis-Courmont
This adds a variant of the postfilter for use with 256-bit vectors (or
larger). Since the function requires 160-bit logical vectors, we can
cut the group multiplier down to just one.
The different vector type is passed via register. Unfortunately,
there is no VSETIVL
From: Rémi Denis-Courmont
This is optimised for a vector size of 128 bits. Or maybe it would be
more accurate to state that this is not properly optimised for larger
vector sizes, as they would work just fine with a smaller vector group
multiplier.
---
libavcodec/opusdsp.c| 2 ++
From: Rémi Denis-Courmont
In most cases, the vector type (VTYPE) for the RISC-V Vector extension
is supplied as an immediate value, with either of the VSETVLI or
VSETIVLI instructions. There is however a third instruction VSETVL
which takes the vector type from a general purpose register. That
From: Rémi Denis-Courmont
This saves almost exactly 25% on SiFive U74.
deemphasis_c: 11536.2
deemphasis_rvf: 8654.2
---
libavcodec/opusdsp.c| 2 ++
libavcodec/opusdsp.h| 1 +
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/opusdsp_init.c | 36
From: Rémi Denis-Courmont
This is currently 64-bit only because the stack spilling code would not
assemble on RV32I (and it would corrupt s0 and s1 on RV128I, in theory).
This could be added later in the unlikely event that someone wants it.
---
libswscale/riscv/rgb2rgb.c | 10 +++
From: Rémi Denis-Courmont
---
libswscale/riscv/rgb2rgb.c | 4
libswscale/riscv/rgb2rgb_rvv.S | 26 ++
2 files changed, 30 insertions(+)
diff --git a/libswscale/riscv/rgb2rgb.c b/libswscale/riscv/rgb2rgb.c
index 5654154494..32c1546827 100644
---
From: Rémi Denis-Courmont
---
libswscale/rgb2rgb.c | 2 +
libswscale/rgb2rgb.h | 1 +
libswscale/riscv/Makefile | 2 +
libswscale/riscv/rgb2rgb.c | 47
libswscale/riscv/rgb2rgb_rvv.S | 78 ++
5 files changed,
From: Rémi Denis-Courmont
---
libavcodec/riscv/pixblockdsp_init.c | 6 +-
libavcodec/riscv/pixblockdsp_rvv.S | 7 +++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index 69dbd18918..bbda381c12
From: Rémi Denis-Courmont
---
libavcodec/riscv/pixblockdsp_init.c | 4
libavcodec/riscv/pixblockdsp_rvv.S | 16
2 files changed, 20 insertions(+)
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index bbda381c12..aa39a8a665 100644
From: Rémi Denis-Courmont
---
libavcodec/riscv/Makefile | 1 +
libavcodec/riscv/pixblockdsp_init.c | 12 ++
libavcodec/riscv/pixblockdsp_rvv.S | 37 +
3 files changed, 50 insertions(+)
create mode 100644 libavcodec/riscv/pixblockdsp_rvv.S
diff
From: Rémi Denis-Courmont
---
libavcodec/riscv/idctdsp_init.c | 6 +-
libavcodec/riscv/idctdsp_rvv.S | 16
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
index 1a6add80da..58b8a6c97a 100644
From: Rémi Denis-Courmont
---
libavcodec/riscv/idctdsp_init.c | 3 +++
libavcodec/riscv/idctdsp_rvv.S | 21 +
2 files changed, 24 insertions(+)
diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
index 58b8a6c97a..e6e616a555 100644
---
From: Rémi Denis-Courmont
---
libavcodec/idctdsp.c| 2 ++
libavcodec/idctdsp.h| 2 ++
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/idctdsp_init.c | 41 +++
libavcodec/riscv/idctdsp_rvv.S | 43 +
From: Rémi Denis-Courmont
---
libavutil/riscv/cpu.h | 45 +++
1 file changed, 45 insertions(+)
create mode 100644 libavutil/riscv/cpu.h
diff --git a/libavutil/riscv/cpu.h b/libavutil/riscv/cpu.h
new file mode 100644
index 00..56035f8556
---
From: Rémi Denis-Courmont
---
tests/checkasm/sw_rgb.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tests/checkasm/sw_rgb.c b/tests/checkasm/sw_rgb.c
index 7cd815e5be..da401e8201 100644
--- a/tests/checkasm/sw_rgb.c
+++ b/tests/checkasm/sw_rgb.c
@@ -68,7 +68,7 @@
From: Rémi Denis-Courmont
---
Makefile | 2 +-
configure| 15 +++
ffbuild/arch.mak | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@
From: Rémi Denis-Courmont
Those mnemonics require the very latest binutils release at the time of
writing. These macros provide seamless backward compatibility.
---
libavutil/riscv/asm.S | 19 +++
1 file changed, 19 insertions(+)
diff --git a/libavutil/riscv/asm.S
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c| 2 ++
libavutil/float_dsp.h| 1 +
libavutil/riscv/Makefile | 4 +++-
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 4 +++
libavcodec/riscv/aacpsdsp_rvv.S | 56
2 files changed, 60 insertions(+)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index c2201ffb6a..f42baf4251
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 5 +
libavcodec/riscv/aacpsdsp_rvv.S | 35
2 files changed, 40 insertions(+)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 09f16f1041..1d36f89f6e
From: Rémi Denis-Courmont
This starts with one-time initialisation of the 26 constant factors
like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with
the scalar instruction set. While the formula can readily be vectored,
the gains would (probably) be more than lost in transferring the
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 6 +-
libavcodec/riscv/aacpsdsp_rvv.S | 17 +
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 83f6d9b16b..21fd5b8470
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 6 +-
libavcodec/riscv/aacpsdsp_rvv.S | 35
2 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index
From: Rémi Denis-Courmont
---
libavcodec/fmtconvert.c| 2 ++
libavcodec/fmtconvert.h| 1 +
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/fmtconvert_init.c | 39 ++
libavcodec/riscv/fmtconvert_rvv.S | 39
From: Rémi Denis-Courmont
---
libavcodec/aacpsdsp.h| 1 +
libavcodec/aacpsdsp_template.c | 2 ++
libavcodec/riscv/Makefile| 2 ++
libavcodec/riscv/aacpsdsp_init.c | 37
libavcodec/riscv/aacpsdsp_rvv.S | 37
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 3 +++
libavcodec/riscv/audiodsp_rvv.S | 17 +
2 files changed, 20 insertions(+)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ac06848a82..9c9265531d 100644
---
From: Rémi Denis-Courmont
This uses the following vectorisation:
for (i = 0; i < blocksize; i++) {
    float m = mag[i], a = ang[i]; /* both results use the original values */
    ang[i] = m - copysignf(fmaxf(a, 0.f), m);
    mag[i] = m - copysignf(fminf(a, 0.f), m);
}
---
libavcodec/riscv/Makefile | 2 ++
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 5 -
libavcodec/riscv/audiodsp_rvv.S | 19 +++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index 9c9265531d..32c3c6794d
From: Rémi Denis-Courmont
---
libavcodec/riscv/fmtconvert_init.c | 7 ++-
libavcodec/riscv/fmtconvert_rvv.S | 28
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c
index
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c| 4 +++-
libavutil/fixed_dsp.h| 1 +
libavutil/riscv/Makefile | 4 +++-
libavutil/riscv/fixed_dsp_init.c | 38 ++
libavutil/riscv/fixed_dsp_rvv.S | 40
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 8982436647..a1cd180cdc 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a559bbb32b..8982436647 100644
---
From: Rémi Denis-Courmont
---
libavcodec/riscv/Makefile| 1 +
libavcodec/riscv/audiodsp_init.c | 9
libavcodec/riscv/audiodsp_rvv.S | 36
3 files changed, 46 insertions(+)
create mode 100644 libavcodec/riscv/audiodsp_rvv.S
diff --git
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 20
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 44a505308d..e61f887862 100644
---
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d:
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 33
2 files changed, 36 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index b99e3080c9..44a505308d
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 24 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a1cd180cdc..b99e3080c9 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 21 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9e19413d5d..a559bbb32b 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 29114dfb82..9e19413d5d 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +-
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2482094ab4..29114dfb82
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +-
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 3386139d49..2482094ab4
From: Rémi Denis-Courmont
Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
get_pixels_c: 180.0
get_pixels_rvi: 136.7
---
libavcodec/pixblockdsp.c| 2 +
libavcodec/pixblockdsp.h| 2 +
libavcodec/riscv/Makefile | 2 +
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 ++
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 23 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f4299049b0..3386139d49 100644
---
From: Rémi Denis-Courmont
RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.
Benchmarks on SiFive U74-MC (courtesy of
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 77 +++
1 file changed, 77 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..dbd97f40a4
---
From: Rémi Denis-Courmont
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 4 ++
libavcodec/riscv/aacpsdsp_rvv.S | 65
2 files changed, 69 insertions(+)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 20b1a12741..58a4c61121 100644
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 3 +++
libavcodec/riscv/aacpsdsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 76f55502ee..20b1a12741
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 14 +
libavcodec/riscv/aacpsdsp_rvv.S | 35
2 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index
From: Rémi Denis-Courmont
This starts with one-time initialisation of the 26 constant factors
like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with
the scalar instruction set. While the formula can readily be vectored,
the gains would (probably) be more than lost in transferring the
From: Rémi Denis-Courmont
This uses the following vectorisation:
for (i = 0; i < blocksize; i++) {
    float m = mag[i], a = ang[i]; /* both results use the original values */
    ang[i] = m - copysignf(fmaxf(a, 0.f), m);
    mag[i] = m - copysignf(fminf(a, 0.f), m);
}
---
libavcodec/riscv/Makefile | 2 ++
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 6 +-
libavcodec/riscv/aacpsdsp_rvv.S | 17 +
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 525fc9aa38..90c9c501c3
From: Rémi Denis-Courmont
---
libavcodec/fmtconvert.c| 2 ++
libavcodec/fmtconvert.h| 1 +
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/fmtconvert_init.c | 39 ++
libavcodec/riscv/fmtconvert_rvv.S | 39
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 7 ++-
libavcodec/riscv/audiodsp_rvv.S | 17 +
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ce8b60ee52..ddd561484f
From: Rémi Denis-Courmont
---
libavcodec/aacpsdsp.h| 1 +
libavcodec/aacpsdsp_template.c | 2 ++
libavcodec/riscv/Makefile| 2 ++
libavcodec/riscv/aacpsdsp_init.c | 37
libavcodec/riscv/aacpsdsp_rvv.S | 37
From: Rémi Denis-Courmont
---
libavcodec/riscv/fmtconvert_init.c | 7 ++-
libavcodec/riscv/fmtconvert_rvv.S | 28
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c
index
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 2 ++
libavcodec/riscv/audiodsp_rvv.S | 19 +++
2 files changed, 21 insertions(+)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ddd561484f..6f38b7bc83 100644
---
From: Rémi Denis-Courmont
---
libavcodec/riscv/Makefile| 1 +
libavcodec/riscv/audiodsp_init.c | 9
libavcodec/riscv/audiodsp_rvv.S | 36
3 files changed, 46 insertions(+)
create mode 100644 libavcodec/riscv/audiodsp_rvv.S
diff --git
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 33
2 files changed, 36 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9b8fd9942b..dacd81c08b
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c| 4 +++-
libavutil/fixed_dsp.h| 1 +
libavutil/riscv/Makefile | 4 +++-
libavutil/riscv/fixed_dsp_init.c | 38 ++
libavutil/riscv/fixed_dsp_rvv.S | 40
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 24 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f164b1308f..9b8fd9942b 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 20
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index dacd81c08b..cc9b7e83dc 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index d17d0f66c5..2ddd2050f7 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2ddd2050f7..f164b1308f 100644
---
From: Rémi Denis-Courmont
---
Makefile | 2 +-
configure| 15 +++
ffbuild/arch.mak | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index b829c0f736..60b79bd59e 100644
---
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c| 2 ++
libavutil/float_dsp.h| 1 +
libavutil/riscv/Makefile | 4 +++-
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 21 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index c2d93e0cd7..d17d0f66c5 100644
---
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +-
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 60b79bd59e..6027a67b46
From: Rémi Denis-Courmont
Those mnemonics require the very latest binutils release at the time of
writing. These macros provide seamless backward compatibility.
---
libavutil/riscv/asm.S | 19 +++
1 file changed, 19 insertions(+)
diff --git a/libavutil/riscv/asm.S
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 9 -
libavutil/riscv/float_dsp_rvv.S | 17 +
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index de567c50d2..b829c0f736
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 6027a67b46..c2d93e0cd7 100644
---
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d:
From: Rémi Denis-Courmont
Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
get_pixels_c: 180.0
get_pixels_rvi: 136.7
---
libavcodec/pixblockdsp.c| 2 +
libavcodec/pixblockdsp.h| 2 +
libavcodec/riscv/Makefile | 2 +
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 77 +++
1 file changed, 77 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..dbd97f40a4
---
From: Rémi Denis-Courmont
RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.
Benchmarks on SiFive U74-MC (courtesy of
From: Rémi Denis-Courmont
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 3 +++
libavcodec/riscv/aacpsdsp_rvv.S | 37
2 files changed, 40 insertions(+)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 76f55502ee..20b1a12741
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 6 +-
libavcodec/riscv/aacpsdsp_rvv.S | 19 +++
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 525fc9aa38..90c9c501c3
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 2 ++
libavcodec/riscv/audiodsp_rvv.S | 20
2 files changed, 22 insertions(+)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ddd561484f..6f38b7bc83 100644
---
From: Rémi Denis-Courmont
---
libavcodec/riscv/audiodsp_init.c | 7 ++-
libavcodec/riscv/audiodsp_rvv.S | 18 ++
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ce8b60ee52..ddd561484f
From: Rémi Denis-Courmont
---
libavcodec/riscv/aacpsdsp_init.c | 14
libavcodec/riscv/aacpsdsp_rvv.S | 37
2 files changed, 47 insertions(+), 4 deletions(-)
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index
From: Rémi Denis-Courmont
This starts with one-time initialisation of the 26 constant factors
like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with
the scalar instruction set. While the formula can readily be vectored,
the gains would (probably) be more than lost in transferring the
From: Rémi Denis-Courmont
---
libavcodec/aacpsdsp.h| 1 +
libavcodec/aacpsdsp_template.c | 2 ++
libavcodec/riscv/Makefile| 2 ++
libavcodec/riscv/aacpsdsp_init.c | 37 ++
libavcodec/riscv/aacpsdsp_rvv.S | 39
From: Rémi Denis-Courmont
This uses the following vectorisation:
for (i = 0; i < blocksize; i++) {
    float m = mag[i], a = ang[i]; /* both results use the original values */
    ang[i] = m - copysignf(fmaxf(a, 0.f), m);
    mag[i] = m - copysignf(fminf(a, 0.f), m);
}
---
libavcodec/riscv/Makefile | 2 ++
From: Rémi Denis-Courmont
---
libavcodec/fmtconvert.c| 2 ++
libavcodec/fmtconvert.h| 1 +
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/fmtconvert_init.c | 39 +
libavcodec/riscv/fmtconvert_rvv.S | 40
From: Rémi Denis-Courmont
---
libavcodec/riscv/fmtconvert_init.c | 7 ++-
libavcodec/riscv/fmtconvert_rvv.S | 29 +
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c
index
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d:
From: Rémi Denis-Courmont
---
libavcodec/riscv/Makefile| 1 +
libavcodec/riscv/audiodsp_init.c | 9
libavcodec/riscv/audiodsp_rvv.S | 37
3 files changed, 47 insertions(+)
create mode 100644 libavcodec/riscv/audiodsp_rvv.S
diff --git
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 23 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index dacd81c08b..cc9b7e83dc 100644
---
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c| 4 +++-
libavutil/fixed_dsp.h| 1 +
libavutil/riscv/Makefile | 4 +++-
libavutil/riscv/fixed_dsp_init.c | 38 +
libavutil/riscv/fixed_dsp_rvv.S | 41
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9b8fd9942b..dacd81c08b