ping

On Fri, Mar 8, 2024 at 17:46, flow gg <hlefthl...@gmail.com> wrote:
> Alright, using m8, but for now not adding code to address the dependencies
> in loops, since they have only a minor impact. Updated in the reply.
>
> On Fri, Mar 8, 2024 at 17:08, Rémi Denis-Courmont <r...@remlab.net> wrote:
>
>> On 8 March 2024 02:45:46 GMT+02:00, flow gg <hlefthl...@gmail.com> wrote:
>> >> Isn't it also faster to max LMUL for the adds here?
>> >
>> >It requires the use of one more vset, making the time slightly longer:
>> >147.7 (m1), 148.7 (m8 + vset).
>>
>> A variation of 0.6% on a single set of kernels will end up below
>> measurement noise in real overall codec usage. And then reducing the
>> I-cache contention can improve performance in other ways. Larger LMUL
>> should also improve performance on bigger cores with more ALUs. So it's
>> not all black and white.
>>
>> My personal preference is to keep the code small if it makes almost no
>> difference, but I'm not BDFL.
>>
>> >Also this might not be very noticeable on C908, but avoiding sequential
>> >dependencies on the address registers may help. I mean, avoid using as
>> >address operand a value that was calculated by the immediately previous
>> >instruction.
>> >
>> >> Okay, but the test results haven't changed.
>> >It would add more than ten lines of code, perhaps shorter code would be
>> >better?
>>
>> I don't know. There are definitely in-order vector cores coming, and data
>> dependencies will hurt them. But I don't know if anyone will care about
>> FFmpeg on those.
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".