On Friday, 25 August 2023, 17:58:40 EEST, Anton Khirnov wrote:
> > And then sometimes an argument has been argued to death previously and
> > there is really no point to rehash it again and again. If people cannot
> > agree, they should refer to the TC, not brute force the review through
On Thursday, 24 August 2023, 22:56:14 EEST, Michael Niedermayer wrote:
> Suggested text is from Anton
>
> Signed-off-by: Michael Niedermayer
> ---
> doc/developer.texi | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/doc/developer.texi b/doc/developer.texi
> index
---
libavutil/riscv/timer.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/libavutil/riscv/timer.h b/libavutil/riscv/timer.h
index b418d13a26..df1a730b5e 100644
--- a/libavutil/riscv/timer.h
+++ b/libavutil/riscv/timer.h
@@ -48,6 +48,7 @@ static inline uint64_t ff_read_time(void)
}
AV_READ_TIME has no side effects. It does not need to be volatile.
---
libavutil/riscv/timer.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/riscv/timer.h b/libavutil/riscv/timer.h
index c2465a4524..b418d13a26 100644
--- a/libavutil/riscv/timer.h
+++
So far, AV_READ_TIME would return the cycle counter. This posed two
problems:
1) On recent systems, it would just raise an illegal instruction
exception. Indeed RDCYCLE is blocked in user space to ward off some
side channel attacks. In particular, this would cause the random
number
-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
diff --git a/libavcodec/riscv/g722dsp_init.c b/libavcodec/riscv/g722dsp_init.c
new file mode 100644
index 00..77e29bfb56
--- /dev/null
+++ b/libavcodec/riscv/g722dsp_init.c
@@ -0,0 +1,40 @@
+/*
+ * Copyright © 2023 Rémi Denis-Courmont.
+ *
+ * This file
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -1,5 +1,5 @@
/*
- * Copyright © 2022 Rémi Denis-Courmont.
+ * Copyright © 2022-2023 Rémi Denis-Courmont.
*
* This file is part of FFmpeg.
*
@@ -20,13 +20,16 @@
#include "libavutil/riscv/asm.S"
-f
With 5 accumulator vectors and 6 inputs, this can only use LMUL=2.
Also the number of vector loop iterations is small, just 5 on 128-bit
vector hardware.
The vector loop is somewhat unusual in that it processes data in
descending memory order, in order to save on vector slides:
in descending
Given the size of the data set, strided memory accesses cannot be avoided.
We can still do better than the current code.
ps_hybrid_synthesis_deint_c: 12065.5
ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before)
ps_hybrid_synthesis_deint_rvv_i64: 8181.0 (after)
---
On 10 November 2023 12:54:30 GMT+02:00, Hendrik Leppkes wrote:
>On Thu, Nov 9, 2023 at 6:04 PM Rémi Denis-Courmont wrote:
>>
>> On Thursday, 9 November 2023, 18:50:52 EET, Michael Niedermayer wrote:
>> > that said, i checked ML subscribers and found
>
In my personal opinion, we should not need to support unaligned YUY2
pixel maps. They should always be aligned to at least 32 bits, and the
current code assumes just 16 bits. However checkasm does test for
unaligned input bitmaps. QEMU accepts it, but real hardware does not.
In this particular
This saves three scratch registers and three instructions per line. The
performance gains are mostly negligible. The main point is to free up
registers for further rework.
---
libswscale/riscv/rgb2rgb_rvv.S | 25 -
1 file changed, 12 insertions(+), 13 deletions(-)
diff
This is restricted to 128-bit vectors as larger vector sizes could read
past the end of the noise array. Support for future hardware with larger
vector sizes is left for some other time.
hf_apply_noise_0_c: 2319.7
hf_apply_noise_0_rvv_f32: 1229.0
hf_apply_noise_1_c: 2539.0
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.
This actually makes the zero case happen pseudorandomly.
---
tests/checkasm/sbrdsp.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git
Hi,
On 9 November 2023 12:16:28 GMT+02:00, "Dawid Kozinski/Multimedia (PLT)
/SRPOL/Staff Engineer/Samsung Electronics" wrote:
>Hi,
>
>Both the EVC encoder and the decoder implementations for FFmpeg depend on
>external libraries (at least for now). They are just wrappers using external
On Thursday, 9 November 2023, 22:45:35 EET, Alexander Strasser wrote:
> I can't see how the reason for the presence of code can be ultimately
> defined objectively and non-arbitrary.
Ultimately, this was discussed and decided in a meeting, which Michael
attended (albeit remotely) and for
On Thursday, 9 November 2023, 20:34:53 EET, Rémi Denis-Courmont wrote:
> In my personal opinion, we should not need to support unaligned YUY2
> pixel maps. They should always be aligned to at least 32 bits, and the
> current code assumes just 16 bits. However checkasm
On Thursday, 9 November 2023, 19:41:53 EET, Michael Niedermayer wrote:
> On Thu, Nov 09, 2023 at 07:04:00PM +0200, Rémi Denis-Courmont wrote:
> > On Thursday, 9 November 2023, 18:50:52 EET, Michael Niedermayer wrote:
> > > that said, i checked ML subscribers
On Thursday, 9 November 2023, 20:11:12 EET, Cosmin Stejerean via ffmpeg-devel wrote:
> > On Nov 9, 2023, at 9:53 AM, Rémi Denis-Courmont wrote:
> >
> > The point is that, whether or not they are on the mailing list, people
> > should
> > not be v
hf_gen_c: 2922.7
hf_gen_rvv_f32: 731.5
---
libavcodec/riscv/sbrdsp_init.c | 4 +++
libavcodec/riscv/sbrdsp_rvv.S | 50 ++
2 files changed, 54 insertions(+)
diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c
index
With a value of zero, the function is a glorified memory copy.
---
tests/checkasm/sbrdsp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tests/checkasm/sbrdsp.c b/tests/checkasm/sbrdsp.c
index 2fb14d5bf8..5cc3b33215 100644
--- a/tests/checkasm/sbrdsp.c
+++
Considering the marginality of the measured performance gains (3-4%),
I suppose that we should not merge this. Furthermore those measurements
are not expected to improve with large vector sizes, since the code
uses only 32 bits per vector no matter what.
deemphasis_c: 7703.2
deemphasis_rvv_f32:
Sorry, git-send-email boo boo. Scratch this one message.
--
レミ・デニ-クールモン
http://www.remlab.net/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
.
Rémi Denis-Courmont (5):
doc: reference the RISC-V specification
lavu/riscv: AV_READ_TIME cycle counter
configure/riscv: detect fast CLZ
lavu/riscv: byte-swap operations
lavu/riscv: add optimisations
configure | 6 +++
doc
unused).
- Add checkasm support.
- Use ARM-like macros for FP ABIs.
This patchset is orthogonal to the RISC-V scalar patchset. They can be merged
in any order.
Rémi Denis-Courmont (12):
lavu/riscv: add CPU flags for the RISC
-09-03 21:56:31 +0300)
Rémi Denis-Courmont (3):
riscv: add CPU flags for the RISC-V Vector extension
riscv: initial common header for assembler macros
riscv: add float vector-scalar multiplication
libavutil/cpu.c
On Saturday, 3 September 2022, 22:48:45 EEST, Lynne wrote:
> Sep 3, 2022, 21:34 by r...@remlab.net:
> > On Saturday, 3 September 2022, 22:11:26 EEST, Lynne wrote:
> >> > diff --git a/libavutil/riscv/float_dsp_rvv.S
> >> > b/libavutil/riscv/float_dsp_rvv.S new file mode 100644
> >> > index
On Saturday, 3 September 2022, 22:01:45 EEST, r...@remlab.net wrote:
> +#define ZVE_UP_TO(cap) ((2 * (cap)) - 1)
Stray code. Ignore.
--
Rémi Denis-Courmont
http://www.remlab.net/
On Saturday, 3 September 2022, 22:11:26 EEST, Lynne wrote:
> > diff --git a/libavutil/riscv/float_dsp_rvv.S
> > b/libavutil/riscv/float_dsp_rvv.S new file mode 100644
> > index 00..54ea1d9d6d
> > --- /dev/null
> > +++ b/libavutil/riscv/float_dsp_rvv.S
> > @@ -0,0 +1,60 @@
> > +/*
> > +
On Saturday, 3 September 2022, 22:20:20 EEST, Lynne wrote:
> Sep 3, 2022, 21:01 by r...@remlab.net:
> > From: Rémi Denis-Courmont
> >
> > RVV defines a total of 12 different extensions: V, Zvl32b, Zvl64b,
> > Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zve32x, Zve32f, Zv
On Sunday, 4 September 2022, 0:38:32 EEST, Lynne wrote:
> I need to know the length in C, not assembly.
There may be some corner cases where that makes sense, but typically it
doesn't. Even if you're dealing in fixed-size macro blocks, you should leverage
the larger vectors to unroll and
On Sunday, 4 September 2022, 20:48:26 EEST, Lynne wrote:
> > The pointer arithmetic could be slightly optimised with SH2ADD and
> > SH3ADD instructions from the Zba extension. This would require more
> > conditional code, or requiring support for Zba for probably negligible
> > performance
support for Zba for probably negligible
performance gains though.
Rémi Denis-Courmont (10):
riscv: add CPU flags for the RISC-V Vector extension
riscv: initial common header for assembler macros
riscv: float vector
On Sunday, 4 September 2022, 9:39:36 EEST, Lynne wrote:
> In particular, doing the tail, which consists of 2 equal length transforms.
> On AVX we interleave the coefficients from 2x4pt transforms during
> lookups since we can do them simultaneously and save on
> shuffles. Doing them
On 28 September 2022 13:48:49 GMT+03:00, Anton Khirnov wrote:
>It uses size_t rather than unsigned for the size and conforms to our
>standard naming scheme.
>---
> configure | 5 -
> doc/APIchanges | 3 +++
> libavutil/mem.c | 30 ++
>
On 27 September 2022 23:04:22 GMT+03:00, r...@remlab.net wrote:
>From: Rémi Denis-Courmont
>
>---
> libavcodec/idctdsp.c| 2 ++
> libavcodec/idctdsp.h| 2 ++
> libavcodec/riscv/Makefile | 2 ++
> libavcodec/risc
On 28 September 2022 10:13:57 GMT+03:00, "Martin Storsjö" wrote:
>Signed-off-by: Martin Storsjö
>---
>This should hopefully fix the compile failures on fate,
>http://fate.ffmpeg.org/report.cgi?time=20220927222508=riscv64-linux-gnu-gcc-12
>and
On 28 September 2022 13:51:43 GMT+03:00, "Rémi Denis-Courmont" wrote:
>On 28 September 2022 13:48:49 GMT+03:00, Anton Khirnov wrote:
>>It uses size_t rather than unsigned for the size and conforms to our
>>standard naming scheme.
>>---
>> configur
On 28 September 2022 15:52:46 GMT+03:00, Peter Krefting wrote:
>Hi!
>
>>> The DCBZL instruction is not available for the e500v1 and e500v2
>>> architectures, but may still be recognized by the toolchain, so we need to
>>> remove the test for it explicitly for these architectures.
>> Isn't
.git rvv-swscale
for you to fetch changes up to 18edd2c3108b126fc478635ac1048db60b9d7fc4:
sws/rgb2rgb: RISC-V 64-bit V packed YUYV/UYVY to planar 4:2:2 (2022-09-28
18:23:53 +0300)
----
Rémi Denis-Courmont (3):
sws/rgb2rgb: RISC
0300)
----
Rémi Denis-Courmont (7):
lavu/riscv: helper to read the vector length
lavc/idctdsp: RISC-V V put_pixels_clamped function
lavc/idctdsp: RISC-V V add_pixels_clamped function
lavc/idctdsp: RISC
The loop uses a 32-bit accumulator. The current code would only zero
the lower 16 bits thereof.
---
libavcodec/riscv/audiodsp_rvv.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S
index 8e8bbd2058..af1e07bef9
On 20 October 2022 10:12:28 GMT+03:00, Anton Khirnov wrote:
>libass defines a non-static read_file() symbol, which causes conflicts
>with static linking.
>---
> fftools/ffmpeg.h | 2 +-
> fftools/ffmpeg_mux_init.c | 4 ++--
> fftools/ffmpeg_opt.c | 4 ++--
> 3 files changed, 5
On 20 October 2022 10:29:47 GMT+03:00, "Helmut K. C. Tessarek" wrote:
>-BEGIN PGP SIGNED MESSAGE-
>Hash: SHA512
>
>On 2022-10-20 02:48, Nicolas George wrote:
>> Possibly. But between a library and a final program, the one who is at
>> fault when a non-namespaced symbol conflicts is
On 18 October 2022 23:59:21 GMT+03:00, Henrik Gramner wrote:
>On Tue, Oct 18, 2022 at 6:54 PM Anton Khirnov wrote:
>> +static void thread_set_name(PerThreadContext *p)
>> +{
>> +AVCodecContext *avctx = p->avctx;
>> +int idx = p - p->parent->threads;
>> +char name[16];
>> +
>> +
Ping...
--
Rémi Denis-Courmont
http://www.remlab.net/
Ping
--
Rémi Denis-Courmont
http://www.remlab.net/
On Monday, 19 September 2022, 18:48:11 EEST, James Almer wrote:
> On 9/19/2022 12:35 PM, r...@remlab.net wrote:
> > From: Rémi Denis-Courmont
> >
> > While this probably never overflows, we are better safe than sorry.
> >
> > The callback prototype s
On Monday, 19 September 2022, 20:04:41 EEST, James Almer wrote:
> Signed-off-by: James Almer
> ---
> tests/checkasm/Makefile| 1 +
> tests/checkasm/checkasm.c | 3 ++
> tests/checkasm/checkasm.h | 1 +
> tests/checkasm/vorbisdsp.c | 88 ++
>
FWIW, set LGTM
--
Реми Дёни-Курмон
http://www.remlab.net/
; } AVCodecContext;
I agree that the dual mono flag should be exposed to the application somehow,
but isn't this a silent ABI break?
--
Rémi Denis-Courmont
http://www.remlab.net/
Hmm. It looks like I accidentally dropped a fix-up while rebasing/squashing and
now there's 8-bit instead of 16-bit clipping :-(
Will send the trivial fix in a few hours.
/riscv: fixed vector sum-and-difference with RVV (2022-09-12 17:57:07
+0300)
Change since v4:
- Marked RVV makefile variables as directory-specific to pave the way
for optimisations elsewhere than libavutil.
Rémi Denis-Courmont (18
Hi,
Down-casting to a signed type (here, int16_t) is implementation-defined. And
while normal compilers do the expected thing, with modulo-2^n complement,
sanitizers tend to dislike it.
AFAIK, the clean solution is via a union in which you assign the uint16_t
member, and then read the int16_t
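A minimal sketch of the union approach described above, assuming C99 or later, where reading a union member other than the one last written reinterprets the stored bytes. The helper names are hypothetical and are not taken from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from FFmpeg): reinterpret the 16 bits of an
 * unsigned value as signed, without a narrowing conversion. */
static inline int16_t u16_as_s16(uint16_t u)
{
    union { uint16_t u; int16_t s; } pun;

    pun.u = u;    /* store through the unsigned member */
    return pun.s; /* read the same bits back as two's complement */
}

/* Small self-check; int16_t is two's complement by definition. */
static inline void u16_as_s16_check(void)
{
    assert(u16_as_s16(0x7FFF) == INT16_MAX);
    assert(u16_as_s16(0x8000) == INT16_MIN);
    assert(u16_as_s16(0xFFFF) == -1);
}
```

Since no out-of-range value is ever converted to a signed type, there is nothing implementation-defined left for a sanitizer to flag.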
Hi,
This adds configure, compile-time, run-time detection for RISC-V scalar
and vector extensions. Also a couple of scalar optimisations.
Rémi Denis-Courmont (6):
lavu/cpu: detect RISC-V base extensions
lavu/cpu
] (2022-09-25 17:19:09 +0300)
Rémi Denis-Courmont (31):
lavu/cpu: detect RISC-V base extensions
lavu/riscv: initial common header for assembler macros
lavc/audiodsp: RISC-V F vector_clipf
lavc/pixblockdsp: RISC-V I
On 26 September 2022 09:53:19 GMT+03:00, Lynne wrote:
>Sep 25, 2022, 16:25 by r...@remlab.net:
>
>> From: Rémi Denis-Courmont
>>
>> ---
>> libavutil/riscv/float_dsp_init.c | 9 -
>> libavutil/riscv/float_dsp_rvv.S | 17 +
>
On 26 September 2022 09:51:43 GMT+03:00, Lynne wrote:
>Sep 25, 2022, 16:25 by r...@remlab.net:
>
>> From: Rémi Denis-Courmont
>> -if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) {
>> +if ((flags & AV_CPU_FLAG_RV_ZV
On 26 September 2022 13:51:44 GMT+03:00, Peter Krefting wrote:
>The DCBZL instruction is not available for the e500v1 and e500v2
>architectures, but may still be recognized by the toolchain, so we need to
>remove the test for it explicitly for these architectures.
Isn't this the sort of
to b969ffa8bccd0c94d311404b19abca0de2930c0a:
lavc/aacpsdsp: RISC-V V stereo_interpolate[0] (2022-09-26 17:47:48 +0300)
Rémi Denis-Courmont (31):
lavu/cpu: detect RISC-V base extensions
lavu/riscv: initial common header
On 26 September 2022 10:05:23 GMT+03:00, Lynne wrote:
>Sep 25, 2022, 16:25 by r...@remlab.net:
>
>> Hello,
>>
>> Changes since version 5:
>> - Use shifted-add instructions where applicable (pointer arithmetic) to
>> minimise scalar operations to the absolute minimum.
>> - Add AAC PS
Yes, please.
--
雷米‧德尼-库尔蒙
http://www.remlab.net/
Denis-Courmont (29):
lavu/cpu: detect RISC-V base extensions
lavu/riscv: initial common header for assembler macros
lavc/audiodsp: RISC-V F vector_clipf
lavc/pixblockdsp: RISC-V I get_pixels
lavu/cpu: CPU flags for the RISC-V Vector extension
configure: probe RISC-V
On Wednesday, 14 September 2022, 22:28:01 EEST, Lynne wrote:
> Sep 14, 2022, 19:50 by r...@remlab.net:
> > From: Rémi Denis-Courmont
> >
> > This introduces compile-time and run-time CPU detection on RISC-V. In
> > practice, I doubt that FFmpeg will ever see
Hi,
This small series introduces the same CPU detection and assembler macros
as the earlier V extension stuff but sticking to a scalar use case.
Benchmark results are included in the last patch.
Rémi Denis-Courmont (3):
lavu
On Wednesday, 14 September 2022, 20:50:31 EEST, r...@remlab.net wrote:
> From: Rémi Denis-Courmont
>
> RV64G supports MIN & MAX instructions natively only on floating point
> registers, not general purpose ones. The latter would require the Zbb
> extension. Due to t
On Tuesday, 20 September 2022, 18:00:09 EEST, Andreas Rheinhardt wrote:
> r...@remlab.net:
> > From: Rémi Denis-Courmont
> >
> > ---
> >
> > libavcodec/fmtconvert.c| 2 ++
> > libavcodec/fmtconvert.h| 1 +
>
:
https://git.remlab.net/git/ffmpeg.git rv-cpu
for you to fetch changes up to df310e72cd9b9756faa50d286de41cfaa0604e32:
lavc/aacpsdsp: RISC-V V mul_pair_single (2022-09-20 17:32:52 +0300)
----
Rémi Denis-Courmont (26):
lavu/cpu: det
On Tuesday, 13 September 2022, 18:11:35 EEST, Andreas Rheinhardt wrote:
> r...@remlab.net:
> > From: Rémi Denis-Courmont
> >
> > INT_MAX is (typically) a value with 31 significant bits but float can
> > only represent 23 significant bits,
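The rounding issue raised in the quoted commit message can be sketched as follows, assuming IEEE 754 single precision with round-to-nearest (23 stored mantissa bits, 24 significant bits counting the implicit leading one) and a 32-bit `int`. The function names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Hypothetical illustration: INT_MAX (2^31 - 1) is not representable in
 * single precision, so the conversion rounds up to 2^31. */
static inline float int_max_as_float(void)
{
    volatile float f = (float)INT_MAX; /* volatile: drop any excess precision */
    return f;
}

/* Returns 1 if x can be converted to a 32-bit int without overflow,
 * i.e. if -2^31 <= x < 2^31. NaN yields 0. */
static inline int float_fits_int(float x)
{
    return x >= -2147483648.0f && x < 2147483648.0f;
}
```

Under those assumptions, clamping a float to `(float)INT_MAX` still admits a value that overflows when converted back to `int`, so the upper bound has to be checked with a strict comparison against 2^31.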
On Friday, 23 September 2022, 20:40:30 EEST, Zhao Zhili wrote:
> From: Zhao Zhili
>
> Signed-off-by: Zhao Zhili
> ---
> libavcodec/mjpegdec.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c
> index
On Thursday, 22 September 2022, 21:37:03 EEST, r...@remlab.net wrote:
> @@ -7596,6 +7608,9 @@ if enabled loongarch; then
> echo "LSX enabled ${lsx-no}"
> echo "LASX enabled ${lasx-no}"
> fi
> +if enabled riscv; then
> +echo "RISC-V Vector enabled
repository at:
git.remlab.net:git/ffmpeg.git rvv-vtype
for you to fetch changes up to 1aa3efa0563aaaed376a35c0e7c9fe53089c3a7e:
lavc/opusdsp: RISC-V V (256-bit vectors) postfilter (2022-10-01 15:24:42
+0300)
Rémi Denis
On Thursday, 29 September 2022, 22:51:59 EEST, r...@remlab.net wrote:
> From: Rémi Denis-Courmont
>
> This saves almost exactly 25% on SiFive U74.
>
> deemphasis_c: 11536.2
> deemphasis_rvf: 8654.2
So well, if you trust godbolt.org, some better compiler is able to optim
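The "almost exactly 25%" figure quoted above can be checked from the two timings. The helper below is a hypothetical sketch, not part of checkasm; it only assumes that both timings are in the same units.

```c
#include <assert.h>

/* Hypothetical helper: fraction of time saved by an optimised function,
 * given two timings in the same (arbitrary) units. */
static inline double relative_saving(double baseline, double optimised)
{
    assert(baseline > 0.0);
    return (baseline - optimised) / baseline;
}
```

With the quoted numbers, `relative_saving(11536.2, 8654.2)` evaluates to roughly 0.2498, i.e. just under 25%.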
Pointed out by Andreas Rheinhardt.
---
libavcodec/riscv/fmtconvert_rvv.S | 1 -
libavcodec/riscv/idctdsp_rvv.S | 1 -
libavcodec/riscv/pixblockdsp_rvi.S | 1 -
libavcodec/riscv/pixblockdsp_rvv.S | 1 -
libavcodec/riscv/vorbisdsp_rvv.S | 1 -
libavutil/riscv/asm.S | 2 --
On 5 October 2022 08:00:00 GMT+03:00, Lynne wrote:
>ffmpeg | branch: master | Lynne | Wed Oct 5 06:58:26 2022
>+0200| [b25c6a5704ac114e825577209a610f5e95abe6c0] | committer: Lynne
>
>riscv/alacdsp: drop config.h include
>
>>
In most cases, the vector type (VTYPE) for the RISC-V Vector extension
is supplied as an immediate value, with either of the VSETVLI or
VSETIVLI instructions. There is however a third instruction VSETVL
which takes the vector type from a general purpose register. That is so
the type can be
/riscv/opusdsp_init.c b/libavcodec/riscv/opusdsp_init.c
new file mode 100644
index 00..f1d2c871e3
--- /dev/null
+++ b/libavcodec/riscv/opusdsp_init.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can
This adds a variant of the postfilter for use with 256-bit vectors.
As a single vector is then large enough to perform the scalar product,
the group multiplier is reduced to just one at run-time.
The different vector type is passed via register. Unfortunately,
there is no VSETIVL instruction, so
This adds a variant of the postfilter for use with 512-bit vectors.
Half a vector is enough to perform the scalar product. Normally a whole
vector would be used anyhow. Indeed fractional multipliers are no faster
than the unit multiplier.
But in this particular function, a full vector makes up 16
in the Git repository at:
git.remlab.net:git/ffmpeg.git rvv-vtype
for you to fetch changes up to 026b9eff9d11ff59601e8823ff961ac0fabf55f1:
lavc/opusdsp: RISC-V V (512-bit) postfilter (2022-10-05 19:09:01 +0300)
Rémi Denis
On Monday, 3 October 2022, 18:06:42 EEST, r...@remlab.net wrote:
> From: Rémi Denis-Courmont
>
> 'VSETVLI xd, x0, ...' has rather nonobvious semantics:
> - If xd is x0, then it preserves the current vector length.
> - If xd is not x0, it sets the vector length to the su
Although the DSP function only uses single precision from RISC-V F, the
caller may leave double precision values in the spilled registers if the
calling convention supports double precision hardware floats. Then, we
need to save and restore FS registers as double precision.
Conversely, we do not
..73ca85f344
--- /dev/null
+++ b/tests/checkasm/riscv/checkasm.S
@@ -0,0 +1,178 @@
+/
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redi
[sp, #-48]
> -ldp d10, d11, [sp, #-32]
> -ldp d8, d9, [sp, #-16]
> +ldp d14, d15, [sp], #16
> +ldp d12, d13, [sp], #16
> +ldp d10, d11, [sp], #16
> +
On Sunday, 9 October 2022, 19:36:24 EEST, Reimar Döffinger wrote:
> > While this fixes the ABI violation, it introduces multiple data
> > dependencies on stack pointer due to write-back.
>
> That is true in principle, this is not done consistently at all.
I have not checked the FFmpeg
On Sunday, 2 October 2022, 19:26:21 EEST, James Almer wrote:
> On 10/2/2022 1:13 PM, Rémi Denis-Courmont wrote:
> > On Sunday, 2 October 2022, 18:43:23 EEST, Michael Niedermayer wrote:
> >> Fixes: signed integer overflow: 2040812214 + 255101526 cannot be
> &
to cb5a7b0834cbb3c8264615c351154632def0334a:
lavc/bswapdsp: RISC-V V bswap16_buf (2022-10-02 14:50:57 +0300)
Rémi Denis-Courmont (4):
lavu/riscv: CPU flag for the Zbb extension
lavc/bswapdsp: RISC-V B bswap_buf
lavc/bswapdsp: RISC-V V
On Sunday, 2 October 2022, 18:43:23 EEST, Michael Niedermayer wrote:
> Fixes: signed integer overflow: 2040812214 + 255101526 cannot be represented
> in type 'int' Fixes:
> 51323/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-4791481
> 067503616
>
> Found-by: continuous
On Thursday, 29 September 2022, 22:51:59 EEST, r...@remlab.net wrote:
> From: Rémi Denis-Courmont
>
> This saves almost exactly 25% on SiFive U74.
>
> deemphasis_c: 11536.2
> deemphasis_rvf: 8654.2
One can get the same result with the C version using -ffast-math.
Forget t
On 28 September 2022 00:32:42 GMT+03:00, Lynne wrote:
>Sep 27, 2022, 22:04 by r...@remlab.net:
>
>> Hello,
>>
>> As a general rule, scalable vector instruction sets should be used with the
>> largest possible vector length. There are however a number of operations that
>> just happen with a
On 24 December 2022 14:07:26 GMT+01:00, Camille Oudot wrote:
>Hello,
>
>On Sat, 2022-12-24 at 13:36 +0200, Rémi Denis-Courmont wrote:
>> I don't see why you need an option for this. In parsing the SDP, it
>> should be self-evident if a given socket needs to be re
On Monday, 26 December 2022, 23:47:17 EET, Nicolas George wrote:
> "zhilizhao(赵志立)" (12022-12-26):
> > Just use the same socket file descriptor. Don’t use OS dependent hack to
> > implement a feature.
>
> SO_REUSEADDR is absolutely not a hack.
So I agree that SO_REUSEADDR is "absolutely
On Wednesday, 11 January 2023, 10:52:08 EET, Paul B Mahol wrote:
> > Sorry for the break, I’m trying to figure out how to make it compatible
> > with Windows.
>
> Is this even portable?
If you build FFmpeg correctly, so that all the FFmpeg libraries and the
application code share the
On Friday, 13 January 2023, 5:37:36 EET, zhilizhao(赵志立) wrote:
> > On Jan 13, 2023, at 03:13, Rémi Denis-Courmont wrote:
> >
> > On Wednesday, 11 January 2023, 10:52:08 EET, Paul B Mahol wrote:
> >>> Sorry for the break, I’m trying to figure
On Tuesday, 3 January 2023, 11:03:30 EET, Camille Oudot wrote:
> Hi, I'm back on the topic. Thanks to all of you for your comments.
>
> > So I agree that SO_REUSEADDR is "absolutely not a hack"... if you use
> > it to recycle IP/port pair without waiting for the time-out. But
> > that's
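For reference, enabling the option under discussion on a POSIX system looks roughly like the sketch below. The helper names are hypothetical and this is not the code from the patch; it only shows the `setsockopt()` call, not how two sockets bound to the same address then share incoming packets, which is platform-dependent.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket with SO_REUSEADDR enabled.
 * Returns the file descriptor, or -1 on error. */
static inline int udp_socket_reuseaddr(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int on = 1;

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof (on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if SO_REUSEADDR is set on fd, 0 if not, -1 on error. */
static inline int socket_has_reuseaddr(int fd)
{
    int val = 0;
    socklen_t len = sizeof (val);

    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

The option must be set before `bind()` for it to affect address reuse.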
to use REUSEPORT and BPF, but that will only work on Linux,
and that's not implemented in the patch.)
--
Rémi Denis-Courmont
On Monday, 5 December 2022, 4:51:34 EET, zhilizhao(赵志立) wrote:
> > On Nov 19, 2022, at 02:48, Zhao Zhili wrote:
> >
> > From: Zhao Zhili
> >
> > Unlike the pipe protocol, fd protocol has seek support if it
> > corresponds to a regular file.
> > ---
> > Sometimes it's the only way to
On Sunday, 11 December 2022, 17:17:27 EET, Zhao Zhili wrote:
> From: Zhao Zhili
>
> Unlike the pipe protocol, fd protocol has seek support if it
> corresponds to a regular file.
> ---
> v2: dup the file descriptor for safety
>
> doc/protocols.texi | 24 ++
>