Hi,

On Thu, Apr 5, 2012 at 4:15 PM, Ronald S. Bultje <[email protected]> wrote:
> On Thu, Apr 5, 2012 at 1:29 PM, Christophe Gisquet
> <[email protected]> wrote:
>> 2012/4/5 Ronald S. Bultje <[email protected]>:
>>> This looks OK. Is there a performance benefit? I assume there isn't
>>> anything measurable, because the overhead is relatively low?
>>
>> Yes, the split between prescaled/non-scaled cases shaves about 3 cycles
>> per call, so the overall gain is within measurement noise.
>>
>> However, this makes it clear that such a distinction should be made
>> (I'm thinking of neon code here). For the SSSE3 case, the non-scaled
>> version is ~217 cycles, and the prescaled (producing identical
>> results) is ~141.
>>
>>> As for the code, do please document the arrays in rv34dsp.h, so we
>>> don't have to look at the code to figure out what the difference
>>> between [0][0] and [1][1] is.
>>
>> Done. The commit message is also more verbose, but in the end, it
>> would be interesting to know if it is good enough for someone
>> implementing code based on this.
>>
>> Another (clearer?) solution would be to have:
>> rv40_weight_func rv40_nonscaled_biweight[2];
>> rv40_weight_func rv40_prescaled_biweight[2];
>> in RV34DSPContext
>> and have function pointers set to the correct values in RV34DecContext.
>
> No, this is OK. LGTM.

And pushed (sorry, was somewhat slow last week).
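
For the archives, the alternative layout Christophe floated above (two
separate arrays instead of one two-dimensional table) could be sketched
roughly as below. This is only an illustration, not what was pushed: the
function-pointer signature, the meaning of the array index, and the names
SketchDSPContext, sketch_init, select_biweight and the two stub functions
are all assumptions made up for the example; the real typedef is in
rv34dsp.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed signature, for illustration only; see rv34dsp.h for the real one. */
typedef void (*rv40_weight_func)(uint8_t *dst, const uint8_t *src1,
                                 const uint8_t *src2, int w1, int w2,
                                 ptrdiff_t stride);

/* Hypothetical stand-in for RV34DSPContext with the two-array layout.
 * The index is assumed to select the block size (e.g. 0 = 16x16, 1 = 8x8). */
typedef struct SketchDSPContext {
    rv40_weight_func rv40_nonscaled_biweight[2];
    rv40_weight_func rv40_prescaled_biweight[2];
} SketchDSPContext;

/* Placeholder "implementations": they only leave a marker in dst[0] so the
 * selection can be observed. They do NOT perform any real biweighting. */
static void stub_nonscaled(uint8_t *dst, const uint8_t *src1,
                           const uint8_t *src2, int w1, int w2,
                           ptrdiff_t stride)
{
    (void)src1; (void)src2; (void)w1; (void)w2; (void)stride;
    dst[0] = 0;
}

static void stub_prescaled(uint8_t *dst, const uint8_t *src1,
                           const uint8_t *src2, int w1, int w2,
                           ptrdiff_t stride)
{
    (void)src1; (void)src2; (void)w1; (void)w2; (void)stride;
    dst[0] = 1;
}

/* Fill both arrays; a per-arch init would overwrite entries with SIMD
 * versions. */
static void sketch_init(SketchDSPContext *c)
{
    for (int i = 0; i < 2; i++) {
        c->rv40_nonscaled_biweight[i] = stub_nonscaled;
        c->rv40_prescaled_biweight[i] = stub_prescaled;
    }
}

/* The decoder side would pick the pointer once, depending on whether the
 * weights were prescaled. */
static rv40_weight_func select_biweight(const SketchDSPContext *c,
                                        int prescaled, int size_idx)
{
    assert(size_idx == 0 || size_idx == 1);
    return prescaled ? c->rv40_prescaled_biweight[size_idx]
                     : c->rv40_nonscaled_biweight[size_idx];
}
```

Either layout carries the same information; the named arrays just make
the prescaled/non-scaled distinction visible at the call site instead of
hiding it in a second array index.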

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
