2014-02-09 Christophe Gisquet <[email protected]>:
> This in particular allows to remove inline asm, which is the case for
> x86 in this patch.

Here is another patch, to be applied on top, that slightly optimizes
the default (C) implementation.

-- 
Christophe
From 592aaf0872cfca3b2178ff00e2765329d496aeb4 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet <[email protected]>
Date: Sat, 8 Feb 2014 18:12:56 +0100
Subject: [PATCH 9/9] dcadsp: perform linear access with offset

This seems to noticeably simplify the addressing. Before/after timings
are 260/222 on win32 and 242/231 on win64.
---
 libavcodec/dcadsp.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/libavcodec/dcadsp.c b/libavcodec/dcadsp.c
index 1e09bd3..f1b804d 100644
--- a/libavcodec/dcadsp.c
+++ b/libavcodec/dcadsp.c
@@ -30,14 +30,17 @@ static void decode_hf_c(float dst[DCA_SUBBANDS][8],
                         int32_t scale[DCA_SUBBANDS][2],
                         intptr_t start, intptr_t end)
 {
-    int l;
+    int i, l;
+    const int8_t *pvq = hf_vq[0] + vq_offset;
     for (l = start; l < end; l++) {
         /* 1 vector -> 32 samples but we only need the 8 samples
          * for this subsubframe. */
-        int   i, hfvq = vq_num[l];
+        const int8_t *ptr = pvq + vq_num[l]*32;
         float fscale = scale[l][0] / 16.0;
-        for (i = 0; i < 8; i++)
-            dst[l][i] = hf_vq[hfvq][vq_offset + i] * fscale;
+        for (i = 0; i < 8; i++) {
+            // hf_vq[hfvq][vq_offset + i] * fscale
+            dst[l][i] = ptr[i] * fscale;
+        }
     }
 }
 
-- 
1.8.0.msysgit.0
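
For anyone who wants to convince themselves that the linear access reads
the same values as the two-dimensional indexing, here is a minimal
standalone sketch. It is not part of the patch: table, ROWS, fill_table,
scale_2d, scale_linear and forms_agree are hypothetical names, the table
contents are made up, and only the row length of 32 and the 8-sample
window are taken from the code above. It assumes, as the patch does, that
the rows of the table are contiguous in memory.

```c
#include <assert.h>
#include <stdint.h>

#define ROWS    4   /* hypothetical: far smaller than the real codebook */
#define ROW_LEN 32  /* one vector holds 32 samples, as in the patch */

/* Stand-in for hf_vq, filled with made-up values. */
static int8_t table[ROWS][ROW_LEN];

static void fill_table(void)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < ROW_LEN; c++)
            table[r][c] = (int8_t)(r * 31 + c - 64);
}

/* Original form: two-dimensional indexing inside the inner loop. */
static void scale_2d(float dst[8], int row, int offset, float fscale)
{
    assert(row >= 0 && row < ROWS && offset >= 0 && offset + 8 <= ROW_LEN);
    for (int i = 0; i < 8; i++)
        dst[i] = table[row][offset + i] * fscale;
}

/* Patched form: resolve row and offset once, then walk linearly. */
static void scale_linear(float dst[8], int row, int offset, float fscale)
{
    assert(row >= 0 && row < ROWS && offset >= 0 && offset + 8 <= ROW_LEN);
    const int8_t *ptr = &table[0][0] + offset + row * ROW_LEN;
    for (int i = 0; i < 8; i++)
        dst[i] = ptr[i] * fscale;
}

/* Returns 1 when both forms agree for every row and 8-sample window. */
static int forms_agree(void)
{
    float a[8], b[8];

    fill_table();
    for (int row = 0; row < ROWS; row++)
        for (int offset = 0; offset + 8 <= ROW_LEN; offset += 8) {
            scale_2d(a, row, offset, 0.25f);
            scale_linear(b, row, offset, 0.25f);
            for (int i = 0; i < 8; i++)
                if (a[i] != b[i])
                    return 0;
        }
    return 1;
}
```

Calling forms_agree() from a small test program is enough to compare the
two forms. The sketch says nothing about speed; the timings in the commit
message are the only measurements here.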
