On Thu, 17 Nov 2016, Janne Grunau wrote:
On 2016-11-14 00:01:42 +0200, Martin Storsjö wrote:
The subtraction directly on sp is one of the operations that are
allowed in thumb mode.
---
libavcodec/arm/vp9itxfm_neon.S | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
index cdb43b5..46fd010 100644
--- a/libavcodec/arm/vp9itxfm_neon.S
+++ b/libavcodec/arm/vp9itxfm_neon.S
@@ -796,10 +796,9 @@ function ff_vp9_\txfm1\()_\txfm2\()_16x16_add_neon, export=1
@ Align the stack, allocate a temp buffer
T mov r12, sp
T bic r12, r12, #15
-T sub r12, r12, #512
T mov sp, r12
A bic sp, sp, #15
-A sub sp, sp, #512
+ sub sp, sp, #512
sub r12, sp, #512 works too and saves the first mov
Oh, right
or better
T mov r7, sp
T bic r7, r7, #15
A bic r7, sp, #15
I guess that should be "and r7, r7/sp, #15" then
add r7, r7, #512
sub sp, r7
and restore the stack with
add sp, r7
Hmm, that ends up as 4 instructions for thumb and 3 for arm, for aligning
and subtracting. Isn't that equal to, not better than, what you achieve
with your suggestion with "sub r12, sp, #512" for thumb?
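
For reference, here is a rough Python model of the stack arithmetic for the variants discussed above. It is only a sketch: sp is treated as a plain 32-bit integer, the function names are made up for illustration, and the sample sp values are arbitrary. It checks that the variants end up at the same aligned address and that "add sp, r7" restores the original sp only when r7 is computed with "and", not with "bic".

```python
MASK32 = 0xFFFFFFFF  # model 32-bit register wraparound


def align_then_sub(sp, size=512):
    # bic sp, sp, #15 ; sub sp, sp, #512
    return ((sp & ~15) - size) & MASK32


def sub_then_align(sp, size=512):
    # sub r12, sp, #512 ; bic r12, r12, #15 ; mov sp, r12
    # Same result as above because 512 is a multiple of 16.
    return ((sp - size) & MASK32) & ~15


def sub_via_offset(sp, size=512):
    # and r7, sp, #15 ; add r7, r7, #512 ; sub sp, r7
    # r7 holds the total adjustment, so "add sp, r7" undoes it later.
    r7 = ((sp & 15) + size) & MASK32
    return (sp - r7) & MASK32, r7


def sub_via_offset_bic(sp, size=512):
    # Same sequence with "bic" instead of "and": r7 then holds the
    # aligned address plus 512 rather than a small offset.
    r7 = ((sp & ~15) + size) & MASK32
    return (sp - r7) & MASK32, r7


# Arbitrary sample stack pointers with different misalignments.
for sp in (0x7FFF0000, 0x7FFF0008, 0x7FFF000C, 0x7FFFFFFF):
    expected = align_then_sub(sp)
    assert expected % 16 == 0
    assert sub_then_align(sp) == expected

    new_sp, r7 = sub_via_offset(sp)
    assert new_sp == expected
    assert (new_sp + r7) & MASK32 == sp  # add sp, r7 restores sp

    bad_sp, _ = sub_via_offset_bic(sp)
    assert bad_sp != expected  # the "bic" form does not give a usable sp
```

The model says nothing about instruction counts or which encodings thumb allows; it only covers the address arithmetic.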
// Martin