On Wednesday, 4 October 2023, 22:51:10 EEST, Rémi Denis-Courmont wrote:
> +        vwmul.vv v12, v16, v28
> +        vwmul.vv v8, v16, v24
> +        // vwnmsac.vv does _not_ exist so multiply & subtract separately
> +        vwmul.vv v4, v20, v24
> +        vwmacc.vv v8, v20, v28

Retaining three whole double-width product vectors is really hurting here. This
would be a lot smoother with just the top halves (vsmul.vv), but then each
multiplication is rounded individually. Of course, the final results would then
differ slightly, and I am not sure what that entails for the audio codecs (nor
would I know how to update the x86 code).
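
To illustrate the rounding concern, here is a rough scalar model in C of one
32-bit lane. It is only a sketch: the function names are made up, and the
Q31 scaling (shift by 31) and the round-to-nearest-up mode (vxrm = rnu) are
assumptions for illustration, not taken from the patch. It also assumes that
">>" on a negative int64_t is an arithmetic shift, as on mainstream compilers.

```c
#include <stdint.h>

/* Hypothetical model of one vsmul.vv lane at SEW=32 with vxrm = rnu:
 * add half an LSB, drop the low 31 bits of the product, then saturate. */
static int32_t smul_q31(int32_t a, int32_t b)
{
    int64_t p = ((int64_t)a * b + (INT64_C(1) << 30)) >> 31;

    if (p > INT32_MAX) /* only reachable when a == b == INT32_MIN */
        p = INT32_MAX;
    return (int32_t)p;
}

/* Wide path (vwmul.vv + vwmacc.vv style): keep both 64-bit products,
 * add them, and round once at the end.
 * Assumes the 64-bit sum does not overflow. */
static int64_t mac_wide(int32_t a, int32_t c, int32_t b, int32_t d)
{
    int64_t sum = (int64_t)a * c + (int64_t)b * d;

    return (sum + (INT64_C(1) << 30)) >> 31;
}

/* Narrow path (vsmul.vv style): round each product, then add. */
static int64_t mac_narrow(int32_t a, int32_t c, int32_t b, int32_t d)
{
    return (int64_t)smul_q31(a, c) + smul_q31(b, d);
}
```

With all four inputs set to 1 << 15, each product is exactly half an LSB after
scaling: mac_wide() returns 1, whereas mac_narrow() rounds each half up and
returns 2. That off-by-one is the kind of difference I mean.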

N.B.: Don't ask for benchmarks; this can't run on v0.7.1 proto-RVV hardware.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/



_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
