From: Jason Garrett-Glaser <[email protected]>

This prevents a call to bytestream_get_be16() using a movzwl both before
and after the ror instruction, which is obviously inefficient. Arm uses
the same trick also.

Sintel decoding goes from (avg+SD) 9.856 +/- 0.003 to 9.797 +/- 0.003 sec.

Signed-off-by: Ronald S. Bultje <[email protected]>
---
 libavutil/x86/bswap.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/x86/bswap.h b/libavutil/x86/bswap.h
index 28e3fec..b60d9cc 100644
--- a/libavutil/x86/bswap.h
+++ b/libavutil/x86/bswap.h
@@ -29,9 +29,9 @@
 #include "libavutil/attributes.h"
 
 #define av_bswap16 av_bswap16
-static av_always_inline av_const uint16_t av_bswap16(uint16_t x)
+static av_always_inline av_const unsigned av_bswap16(unsigned x)
 {
-    __asm__("rorw $8, %0" : "+r"(x));
+    __asm__("rorw $8, %w0" : "+r"(x));
     return x;
 }
 
-- 
1.7.2.1

_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to