Re: [FFmpeg-devel] [RFC]lavu/x86/pixelutils: Call emms before returning.

2016-10-02 Thread Michael Niedermayer
On Sun, Oct 02, 2016 at 10:53:17PM +0200, Carl Eugen Hoyos wrote:
> Hi!
> 
> The functions in libavutil/x86/pixelutils.asm are exported 
> to the library users if I understand the code correctly.
> I suspect it can be expected that the MMX state is reset 
> after returning.
> 
> Fixes the pixelutils fate test with musl on x86-32.
> 
> Please comment, Carl Eugen

>  pixelutils.asm |5 +
>  1 file changed, 5 insertions(+)
> a7b729f55ec6170b1c4b1959a4be7530926368ce  
> 0001-lavu-x86-pixelutils-Call-emms-before-returning.patch
> From f16831e6dc0f36c8f290d927bf51142c64fe8afc Mon Sep 17 00:00:00 2001
> From: Carl Eugen Hoyos 
> Date: Sun, 2 Oct 2016 22:45:51 +0200
> Subject: [PATCH] lavu/x86/pixelutils: Call emms before returning.
> 
> The functions are exported and the library user can
> expect that the MMX state is reset after returning.
> ---
>  libavutil/x86/pixelutils.asm |5 +
>  1 file changed, 5 insertions(+)

I think the performance impact of this on some CPUs is too large
to call this at a 8x8 and 16x16 granularity

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Breaking DRM is a little like attempting to break through a door even
though the window is wide open and the only thing in the house is a bunch
of things you dont want and which you would get tomorrow for free anyway


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [RFC]lavu/x86/pixelutils: Call emms before returning.

2016-10-02 Thread Hendrik Leppkes
On Sun, Oct 2, 2016 at 10:53 PM, Carl Eugen Hoyos  wrote:
> Hi!
>
> The functions in libavutil/x86/pixelutils.asm are exported
> to the library users if I understand the code correctly.
> I suspect it can be expected that the MMX state is reset
> after returning.
>
> Fixes the pixelutils fate test with musl on x86-32.
>
> Please comment, Carl Eugen
>

These functions are called in tight loops, and emms is then called in
the user-code after the loop. Calling emms inside the DSP function
would result in a performance degredation.

If a test doesn't call emms after using it, then that should be fixed.
And the docs should perhaps be amended to make note of that. Many DSP
functions work like that, to allow using them in tight loops without
the overhead.

- Hendrik
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


[FFmpeg-devel] [RFC]lavu/x86/pixelutils: Call emms before returning.

2016-10-02 Thread Carl Eugen Hoyos
Hi!

The functions in libavutil/x86/pixelutils.asm are exported 
to the library users if I understand the code correctly.
I suspect it can be expected that the MMX state is reset 
after returning.

Fixes the pixelutils fate test with musl on x86-32.

Please comment, Carl Eugen
From f16831e6dc0f36c8f290d927bf51142c64fe8afc Mon Sep 17 00:00:00 2001
From: Carl Eugen Hoyos 
Date: Sun, 2 Oct 2016 22:45:51 +0200
Subject: [PATCH] lavu/x86/pixelutils: Call emms before returning.

The functions are exported and the library user can
expect that the MMX state is reset after returning.
---
 libavutil/x86/pixelutils.asm |5 +
 1 file changed, 5 insertions(+)

diff --git a/libavutil/x86/pixelutils.asm b/libavutil/x86/pixelutils.asm
index 7af3007..092eb6e 100644
--- a/libavutil/x86/pixelutils.asm
+++ b/libavutil/x86/pixelutils.asm
@@ -61,6 +61,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, 
stride2
 paddw   m6, m0
 movdeax, m6
 movzx   eax, ax
+emms
 RET
 
 
;---
@@ -81,6 +82,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, 
stride2
 lea src2q, [src2q + 2*stride2q]
 %endrep
 movdeax, m2
+emms
 RET
 
 
;---
@@ -101,6 +103,7 @@ cglobal pixelutils_sad_16x16, 4,4,0, src1, stride1, src2, 
stride2
 add src2q, stride2q
 %endrep
 movdeax, m2
+emms
 RET
 
 
;---
@@ -131,6 +134,7 @@ cglobal pixelutils_sad_16x16, 4,4,5, src1, stride1, src2, 
stride2
 movhlps m0, m4
 paddw   m4, m0
 movdeax, m4
+emms
 RET
 
 
;---
@@ -158,6 +162,7 @@ cglobal pixelutils_sad_%1_16x16, 4,4,3, src1, stride1, 
src2, stride2
 movhlps m0, m2
 paddw   m2, m0
 movdeax, m2
+emms
 RET
 %endmacro
 
-- 
1.7.10.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel