Re: [FFmpeg-devel] Memcpy Operation Duration

Sven C. Dack Tue, 18 Oct 2016 12:45:13 -0700

On 18/10/16 20:26, Ali KIZIL wrote:

Hi Everyone,


Today I was analyzing memcpy duration in FFmpeg. I noticed that it takes
longer than an optimized SSE, SSE2, MMX, MMX2, AVX or AVX2 based memcpy
operation.

I tried an FFmpeg build compiled with march=corei7-avx2; it does not change
the duration of the memcpy operation.
I also followed https://trac.ffmpeg.org/wiki/CompilationGuide#PerformanceTips.
Same result. In addition, I tried gcc 6.2 in case gcc was not selecting
the correct flag. Same result again.

These memcpy operations affect the decoding (and probably encoding) fps
rates.

In one case, an unscaled uyvy422 to p010 3840x2160 conversion in rawvideo,
the rate increased from 44 fps to 52 fps on a Xeon E5 2630 v4.

Am I missing anything when compiling FFmpeg for AVX2 or other optimisation
flags, or does FFmpeg need a fix to direct some (or all) memcpy operations to
an inherited memcpy operation which can decide the flag for optimisation?
Or is there no such need and I am on the wrong path?

(As a side note, FFmpeg works performance on i7 Extreme cores compared to
Xeon v4 processors.)

Kind Regards,
Ali KIZIL
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

It could be gcc's built-in version. It's been said that libc is occasionally better at it than gcc's built-in version.


Use -fno-builtin-memcpy and see what difference it makes.

