Hi,

On Thu, Apr 5, 2012 at 4:15 PM, Ronald S. Bultje <[email protected]> wrote:
> On Thu, Apr 5, 2012 at 1:29 PM, Christophe Gisquet
> <[email protected]> wrote:
>> 2012/4/5 Ronald S. Bultje <[email protected]>:
>>> This looks OK. Is there a performance benefit? I assume there isn't
>>> anything measurable, because the overhead is relatively low?
>>
>> Yes, the split between prescaled/non-scaled cases shaves like 3 cycles
>> per call, and leads to a result within measurement noise.
>>
>> However, this makes it clear that such a distinction should be made
>> (I'm thinking of neon code here). For the SSSE3 case, the non-scaled
>> version is ~217 cycles, and the prescaled (producing identical
>> results) is ~141.
>>
>>> As for the code, do please document the arrays in rv34dsp.h, so we
>>> don't have to look at the code to figure out what the difference
>>> between [0][0] and [1][1] is.
>>
>> Done. The commit message is also more verbose, but in the end, it
>> would be interesting to know if it is good enough for someone
>> implementing code based on this.
>>
>> Another (clearer?) solution would be to have:
>>   rv40_weight_func rv40_nonscaled_biweight[2];
>>   rv40_weight_func rv40_prescaled_biweight[2];
>> in RV34DSPContext
>> and have function pointers set to the correct values in RV34DecContext.
>
> No, this is OK. LGTM.

And pushed (sorry, was somewhat slow last week).

Ronald
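
For anyone reading the archive later, the alternative layout Christophe floated in the quoted text (two separate tables instead of one two-dimensional array) could look roughly like the sketch below. This is not the code that was pushed. The names `SketchDSPContext`, `weight_func`, `sketch_dsp_init` and `sketch_pick` are hypothetical, the signature is simplified, and the kernel arithmetic (a 9-bit prescale, a rounding constant of 0x10 and a final shift by 5) is an assumption chosen so that both paths give identical results when the low 9 bits of both weights are zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified signature; the real typedef lives in rv34dsp.h. */
typedef void (*weight_func)(uint8_t *dst, const uint8_t *src1,
                            const uint8_t *src2, int w1, int w2,
                            ptrdiff_t stride);

/* Generates one size x size bi-weight kernel from a per-pixel expression. */
#define DEFINE_WEIGHT(name, size, expr)                                   \
static void name(uint8_t *dst, const uint8_t *src1, const uint8_t *src2, \
                 int w1, int w2, ptrdiff_t stride)                        \
{                                                                         \
    for (int y = 0; y < (size); y++) {                                    \
        for (int x = 0; x < (size); x++)                                  \
            dst[x] = (uint8_t)(expr);                                     \
        dst += stride; src1 += stride; src2 += stride;                    \
    }                                                                     \
}

/* Assumed arithmetic: the non-scaled path shifts each product by 9 bits,
 * the prescaled path expects weights that were already shifted once. */
#define NONSCALED ((((w2 * src1[x]) >> 9) + ((w1 * src2[x]) >> 9) + 0x10) >> 5)
#define PRESCALED ((w2 * src1[x] + w1 * src2[x] + 0x10) >> 5)

DEFINE_WEIGHT(nonscaled_16, 16, NONSCALED)
DEFINE_WEIGHT(nonscaled_8,   8, NONSCALED)
DEFINE_WEIGHT(prescaled_16, 16, PRESCALED)
DEFINE_WEIGHT(prescaled_8,   8, PRESCALED)

/* Two one-dimensional tables: index 0 = 16x16 blocks, index 1 = 8x8 blocks. */
typedef struct SketchDSPContext {
    weight_func nonscaled_biweight[2];
    weight_func prescaled_biweight[2];
} SketchDSPContext;

static void sketch_dsp_init(SketchDSPContext *c)
{
    c->nonscaled_biweight[0] = nonscaled_16;
    c->nonscaled_biweight[1] = nonscaled_8;
    c->prescaled_biweight[0] = prescaled_16;
    c->prescaled_biweight[1] = prescaled_8;
}

/* Picks the table on the decoder side. If the low 9 bits of both weights
 * are zero, the weights are shifted once here and the cheaper prescaled
 * kernel is returned; otherwise the weights are left untouched. */
static weight_func sketch_pick(const SketchDSPContext *c,
                               int *w1, int *w2, int is_8x8)
{
    if (((*w1 | *w2) & 0x1FF) == 0) {
        *w1 >>= 9;
        *w2 >>= 9;
        return c->prescaled_biweight[is_8x8];
    }
    return c->nonscaled_biweight[is_8x8];
}
```

The point of the split is that the decoder makes the prescaled/non-scaled decision once per block, so a SIMD (or NEON) implementation only has to fill in whichever table entries it actually accelerates.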
