On Tuesday, 26 September 2023, 21:40:12 EEST, Paul B Mahol wrote:
> On Tue, Sep 26, 2023 at 8:35 PM Rémi Denis-Courmont <r...@remlab.net> wrote:
> > On Tuesday, 26 September 2023, 12:24:58 EEST, flow gg wrote:
> > > benchmark:
> > > fcmul_add_c: 19.7
> > > fcmul_add_rvv_f32: 6.7
> >
> > Nit: please pad mnemonics to at least 8 columns for consistency.
> >
> > I'm a bit surprised that the performance improves this much, considering
> > that the C910 is notoriously bad at both segmented and strided loads. It
> > might be that the C version is just very bad due to lack of aliasing
> > optimisations. Oh well.
>
> What exactly do you mean the C version is missing?
The C version does not have any restrict qualifiers, so the compiler must
assume that the pointers may alias. This potentially prevents the C compiler
from unrolling. Adding the keyword can yield performance gains of 20-30% on
RISC-V scalar floating point.

That said, sometimes you can't validly use restrict, and you simply can't
tell the C compiler how to optimise properly. In those cases, even scalar
floating-point optimisations improve performance.

-- 
Rémi Denis-Courmont
http://www.remlab.net/

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
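
To illustrate the restrict point made in the reply, here is a minimal sketch of a complex multiply-accumulate loop written with and without the qualifier. It is not the FFmpeg fcmul_add_c code; the function names cmul_add_plain and cmul_add_restrict, the interleaved re/im layout and the meaning of len (number of complex values) are assumptions made for the example. Whether the compiler actually unrolls or reorders more in the second version depends on the compiler, flags and target.

```c
#include <stddef.h>

/* Hypothetical baseline (not FFmpeg's code): without restrict, the compiler
 * has to assume that a store to sum[] may modify t[] or c[], which limits
 * how far it can reorder or unroll across iterations. */
void cmul_add_plain(float *sum, const float *t, const float *c, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        const float tre = t[2 * n], tim = t[2 * n + 1];
        const float cre = c[2 * n], cim = c[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim;
        sum[2 * n + 1] += tre * cim + tim * cre;
    }
}

/* Same loop with restrict: the caller promises that the memory written
 * through sum is not accessed through t or c, so loads from later
 * iterations may be hoisted above earlier stores. */
void cmul_add_restrict(float *restrict sum, const float *restrict t,
                       const float *restrict c, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        const float tre = t[2 * n], tim = t[2 * n + 1];
        const float cre = c[2 * n], cim = c[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim;
        sum[2 * n + 1] += tre * cim + tim * cre;
    }
}
```

Both functions compute the same result when the buffers do not overlap. Calling the restrict version with sum overlapping t or c is undefined behaviour, which is the case where the keyword cannot validly be used.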