On Thu, 17 Nov 2016, Janne Grunau wrote:

On 2016-11-14 00:01:42 +0200, Martin Storsjö wrote:
The subtraction directly on sp is one of the operations that are
allowed in thumb mode.
---
 libavcodec/arm/vp9itxfm_neon.S | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
index cdb43b5..46fd010 100644
--- a/libavcodec/arm/vp9itxfm_neon.S
+++ b/libavcodec/arm/vp9itxfm_neon.S
@@ -796,10 +796,9 @@ function ff_vp9_\txfm1\()_\txfm2\()_16x16_add_neon, export=1
         @ Align the stack, allocate a temp buffer
 T       mov             r12, sp
 T       bic             r12, r12, #15
-T       sub             r12, r12, #512
 T       mov             sp,  r12
 A       bic             sp,  sp,  #15
-A       sub             sp,  sp,  #512
+        sub             sp,  sp,  #512

sub r12, sp, #512 works too and saves the first mov

Oh, right

or better

T mov r7, sp
T bic r7, r7, #15
A bic r7, sp, #15

I guess that should be "and r7, r7/sp, #15" then

 add r7, r7, #512
 sub sp, r7

and restore the stack with
add sp, r7
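
For reference, here is a small C model of the two sequences discussed above. It is only a sketch of the arithmetic, not code from the patch: the function names and the uint32_t stand-in for sp are made up for illustration, and it assumes the "and" form of the offset computation (low four bits of sp plus 512).

```c
#include <assert.h>
#include <stdint.h>

#define TMP_SIZE 512u /* size of the temp buffer, as in the patch */

/* Models: bic sp, sp, #15 ; sub sp, sp, #512 */
static uint32_t align_then_sub(uint32_t sp)
{
    return (sp & ~15u) - TMP_SIZE;
}

/* Models: and r7, sp, #15 ; add r7, r7, #512 ; sub sp, r7
 * The offset kept in r7 is written to *r7 so that the caller
 * can later undo the adjustment with "add sp, r7". */
static uint32_t sub_offset(uint32_t sp, uint32_t *r7)
{
    *r7 = (sp & 15u) + TMP_SIZE;
    return sp - *r7;
}

/* Checks, for one hypothetical starting sp, that both sequences give
 * the same 16-byte aligned sp and that adding r7 back restores it. */
static int variants_agree(uint32_t sp)
{
    uint32_t r7;
    uint32_t a = align_then_sub(sp);
    uint32_t b = sub_offset(sp, &r7);
    return a == b && (b & 15u) == 0 && b + r7 == sp;
}
```

Both variants end up at the same sp; the difference is that the second one keeps the total adjustment in r7, so the original sp can be restored with a single add instead of saving the old value.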

Hmm, that ends up as 4 instructions for thumb and 3 for arm, for aligning and subtracting. Isn't that equal to, rather than better than, what your "sub r12, sp, #512" suggestion achieves for thumb?

// Martin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
