On 04/02/15 9:39 AM, Christophe Gisquet wrote:
> Are the first number for each case from before you split out the
> restore part? Otherwise, that's gruesome.
The benchmarks were done with every patch up to this one applied, so yes, after the split. The file I used to bench went from ~36fps to ~46fps after this patch.
The C version must be pretty inefficient (that CMP macro inside the loop probably creates lots of branches). Or maybe GCC was dumb.

> As seen from above, srcstride is constant and is 2*MAX_PB_SIZE +
> FF_INPUT_BUFFER_PADDING_SIZE.
> That may save you one whole gpr. Not really useful here, but I think
> you are more limited for the >8 bits case.
> If you want to exploit this, also add it above void (*sao_edge_filter[5])

Ah, good to know it's constant now. Although until we add x86_32 versions of these functions, it doesn't bring any real benefit. I'll update the prototype and assembly anyway.

For that matter, do .asm files have access to FF_INPUT_BUFFER_PADDING_SIZE? If at some point we change its value (for example, once AVX512 code starts being committed), then srcstride will be something else.
Probably not a problem, since whenever that constant is updated in avcodec.h it can also be updated in hevc_sao.asm, but it would be nice not to have to bother doing that.
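
To illustrate the branch concern about CMP, here is a minimal sketch. It assumes the C macro is a nested-ternary three-way compare (an assumption, not checked against the tree), and `CMP_TERNARY`, `CMP_BRANCHLESS` and `edge_class_row` are hypothetical names used only for this example. Whether a given compiler actually emits branches for the ternary form depends on the compiler and flags.

```c
#include <stdint.h>

/* Assumed shape of the C macro: a nested ternary three-way compare. */
#define CMP_TERNARY(a, b) ((a) > (b) ? 1 : ((a) == (b) ? 0 : -1))

/* Branch-free equivalent: each comparison evaluates to 0 or 1,
 * so the difference is -1, 0 or 1. */
#define CMP_BRANCHLESS(a, b) (((a) > (b)) - ((a) < (b)))

/* Hypothetical helper: horizontal edge class for the interior pixels of
 * one row, computed as 2 + sign(cur - left) + sign(cur - right), which
 * always lands in the range 0..4. idx[0] and idx[width - 1] are left
 * untouched. */
static void edge_class_row(const uint8_t *src, int width, uint8_t *idx)
{
    for (int x = 1; x < width - 1; x++) {
        int d0 = CMP_BRANCHLESS(src[x], src[x - 1]);
        int d1 = CMP_BRANCHLESS(src[x], src[x + 1]);
        idx[x] = (uint8_t)(2 + d0 + d1);
    }
}
```

Both macros return the same value for any pair of integers; the second form just gives the compiler a straight-line expression, which is also roughly what a SIMD version does with compare and subtract instructions.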