vc1dsp: R-V V mspel_pixels

flow gg Thu, 07 Mar 2024 16:46:10 -0800

> Isn't it also faster to max LMUL for the adds here?

It requires one more vset, which makes it slightly slower:
147.7 (m1) vs. 148.7 (m8 + vset).


> Also this might not be much noticeable on C908, but avoiding sequential
> dependencies on the address registers may help. I mean, avoid using as address
> operand a value that was calculated by the immediate previous instruction.

Okay, but the test results haven't changed.
It would add more than ten lines of code, so perhaps the shorter code is better?
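
For illustration only, the address-dependency point can be modelled in scalar
C. This is not code from the patch (which is RVV assembly): copy8_chained and
copy8_decoupled are hypothetical names, the 8x8 block size is an assumption,
and a compiler is free to reschedule either form, so the sketch only shows the
dependency structure being discussed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Chained form: every row address is the previous address plus stride,
 * so each copy uses an address produced by the add just before it. */
static void copy8_chained(uint8_t *dst, const uint8_t *src, ptrdiff_t stride)
{
    for (int i = 0; i < 8; i++) {
        memcpy(dst, src, 8);
        dst += stride;
        src += stride;
    }
}

/* Decoupled form: all row addresses are derived from the base pointers
 * ahead of the copies, so no copy consumes an address that was computed
 * by the operation immediately before it. */
static void copy8_decoupled(uint8_t *dst, const uint8_t *src, ptrdiff_t stride)
{
    const uint8_t *s[8];
    uint8_t *d[8];

    for (int i = 0; i < 8; i++) {
        s[i] = src + i * stride;
        d[i] = dst + i * stride;
    }
    for (int i = 0; i < 8; i++)
        memcpy(d[i], s[i], 8);
}
```

Both functions produce the same output. The second one needs more registers
and more lines, which is the trade-off mentioned above.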

On Fri, Mar 8, 2024 at 02:55, Rémi Denis-Courmont <r...@remlab.net> wrote:

> On Saturday, 2 March 2024, 14.06.13 EET, flow gg wrote:
> > Here adjusting the order, rather than simply using .rept, will be 13%-24%
> > faster.
>
> Isn't it also faster to max LMUL for the adds here?
>
> Also this might not be much noticeable on C908, but avoiding sequential
> dependencies on the address registers may help. I mean, avoid using as address
> operand a value that was calculated by the immediate previous instruction.
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/
>
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>

Re: [FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: R-V V mspel_pixels
