On Wednesday, 4 October 2023, 22:51:10 EEST, Rémi Denis-Courmont wrote:
> + vwmul.vv v12, v16, v28
> + vwmul.vv v8, v16, v24
> + // vwnmsac.vv does _not_ exist so multiply & subtract separately
> + vwmul.vv v4, v20, v24
> + vwmacc.vv v8, v20, v28

Retaining three whole double-width product vectors is really hurting register usage here. This would be a lot smoother with just the top halves (vsmul.vv), but then each multiplication has to be rounded individually. Of course, the final results would then differ slightly, and I am not sure what that entails for the audio codecs (nor would I know how to update the x86 code to match).

N.B.: Don't ask for benchmarks; this can't run on v0.7.1 proto-RVV hardware.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/
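
P.S. To make the rounding concern concrete, here is a rough scalar C model of the two approaches. It is only a sketch under assumptions that are not taken from the patch: 32-bit elements in Q31 format, a final shift of 31 bits, and vxrm set to round-to-nearest-up. The function names (vsmul32, mulsub_wide, mulsub_narrow) are made up for illustration.

```c
#include <stdint.h>

/* Scalar model of vsmul.vv for one 32-bit element, assuming vxrm = rnu:
 * keep the rounded top half of the product, saturating on overflow.
 * Relies on >> of a negative int64_t being an arithmetic shift. */
static int32_t vsmul32(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;

    p = (p + (INT64_C(1) << 30)) >> 31;
    /* Only INT32_MIN * INT32_MIN can exceed the 32-bit range. */
    return p > INT32_MAX ? INT32_MAX : (int32_t)p;
}

/* Widening approach: keep both full 64-bit products, subtract them,
 * then round and narrow once. Final overflow is not handled here. */
static int32_t mulsub_wide(int32_t a, int32_t b, int32_t c, int32_t d)
{
    int64_t acc = (int64_t)a * b - (int64_t)c * d;

    return (int32_t)((acc + (INT64_C(1) << 30)) >> 31);
}

/* Top-half approach: round each product separately, then subtract.
 * Needs no double-width storage, but rounds twice. */
static int32_t mulsub_narrow(int32_t a, int32_t b, int32_t c, int32_t d)
{
    return (int32_t)((int64_t)vsmul32(a, b) - vsmul32(c, d));
}
```

For example, with a = 1 << 30, b = 1, c = 1 << 29 and d = 1, mulsub_wide() returns 0 whereas mulsub_narrow() returns 1: the first product rounds up to 1 and the second rounds down to 0, while the exact difference (a quarter of an LSB) rounds to 0. That one-LSB discrepancy is the kind of mismatch against the C and x86 versions that I would expect.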