On Sun, Feb 11, 2024 at 4:21 PM Andreas Rheinhardt <andreas.rheinha...@outlook.com> wrote:
> Besides simplifying address computations (it saves 432B of .text
> in hevcdsp.o alone here) it also fixes undefined behaviour that
> occurs if mx or my are 0 (happens when the filters are unused)
> because they lead to an array index of -1 in the old code.
> This happens in the checkasm-hevc_pel FATE-test.
>
> Signed-off-by: Andreas Rheinhardt <andreas.rheinha...@outlook.com>
> ---
> The loongarch and mips parts of this are untested. Luckily we have a
> loongarch patchwork runner...
>
>  libavcodec/hevcdsp.c                    |   6 +-
>  libavcodec/hevcdsp.h                    |   5 +-
>  libavcodec/hevcdsp_template.c           |  38 ++--
>  libavcodec/loongarch/hevc_mc.S          | 224 +++++-------------------
>  libavcodec/loongarch/hevc_mc_bi_lsx.c   |   6 +-
>  libavcodec/loongarch/hevc_mc_uni_lsx.c  |   6 +-
>  libavcodec/loongarch/hevc_mc_uniw_lsx.c |   4 +-
>  libavcodec/loongarch/hevcdsp_lsx.c      |   6 +-
>  libavcodec/mips/hevc_mc_bi_msa.c        |   6 +-
>  libavcodec/mips/hevc_mc_biw_msa.c       |   6 +-
>  libavcodec/mips/hevc_mc_uni_msa.c       |   6 +-
>  libavcodec/mips/hevc_mc_uniw_msa.c      |   6 +-
>  libavcodec/mips/hevcdsp_mmi.c           |  20 +--
>  libavcodec/mips/hevcdsp_msa.c           |   6 +-
>  libavcodec/x86/hevcdsp_init.c           |   4 +-
>  15 files changed, 112 insertions(+), 237 deletions(-)
>
> diff --git a/libavcodec/hevcdsp.c b/libavcodec/hevcdsp.c
> index 2ca551df1d..630fdc012e 100644
> --- a/libavcodec/hevcdsp.c
> +++ b/libavcodec/hevcdsp.c
> @@ -91,7 +91,8 @@ static const int8_t transform[32][32] = {
>      90, -90, 88, -85, 82, -78, 73, -67, 61, -54, 46, -38, 31, -22, 13, -4 },
>  };
>
> -DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_filters)[7][4] = {
> +DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_filters)[8][4] = {
> +    { 0 },
>      { -2, 58, 10, -2},
>      { -4, 54, 16, -2},
>      { -6, 46, 28, -4},
> @@ -101,7 +102,8 @@ DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_filters)[7][4] = {
>      { -2, 10, 58, -2},
>  };
>
> -DECLARE_ALIGNED(16, const int8_t, ff_hevc_qpel_filters)[3][16] = {
> +DECLARE_ALIGNED(16, const int8_t, ff_hevc_qpel_filters)[4][16] = {

Do you know why this is [4][16]? [4][8] should suffice.
If some architecture requires 16, we might need to update
VVC_INTER_LUMA_TAPS to 16 in the future.

Thank you.

> +    { 0 },
>      { -1, 4,-10, 58, 17, -5, 1, 0, -1, 4,-10, 58, 17, -5, 1, 0},
>      { -1, 4,-11, 40, 40,-11, 4, -1, -1, 4,-11, 40, 40,-11, 4, -1},
>      { 0, 1, -5, 17, 58,-10, 4, -1, 0, 1, -5, 17, 58,-10, 4, -1}
> diff --git a/libavcodec/hevcdsp.h b/libavcodec/hevcdsp.h
> index 1b9c5bb6bc..a5933dcac4 100644
>
> --
> 2.34.1
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
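
To make the [4][8] question concrete, here is a minimal sketch in plain C of what I have in mind. It is not the actual FFmpeg code: `qpel_filters_sketch` and `qpel_filter_sketch` are hypothetical names, and the table just keeps the first eight taps of each row quoted above, plus the new all-zero row. With that row in place the table can be indexed by mx directly, so mx == 0 no longer forms an index of -1. It says nothing about whether some SIMD implementation relies on the duplicated 16-byte rows, which is the part I am unsure about.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-tap table: row 0 is the all-zero placeholder added by the
 * patch, rows 1..3 are the first eight taps of the rows quoted above. */
static const int8_t qpel_filters_sketch[4][8] = {
    {  0, 0,   0,  0,  0,   0, 0,  0 },
    { -1, 4, -10, 58, 17,  -5, 1,  0 },
    { -1, 4, -11, 40, 40, -11, 4, -1 },
    {  0, 1,  -5, 17, 58, -10, 4, -1 },
};

/* Apply the filter selected by mx (0..3) to the eight samples src[-3..4].
 * Indexing with mx itself (rather than mx - 1) stays inside the table even
 * when mx is 0. */
static int qpel_filter_sketch(const uint8_t *src, int mx)
{
    assert(mx >= 0 && mx < 4);
    const int8_t *filter = qpel_filters_sketch[mx];
    int sum = 0;

    for (int i = 0; i < 8; i++)
        sum += filter[i] * src[i - 3];
    return sum;
}
```

Each non-zero row sums to 64, so a flat input row comes back scaled by 64, and the zero row simply yields 0.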