On 2016-11-18 00:13:51 +0200, Martin Storsjö wrote:
> On Thu, 17 Nov 2016, Janne Grunau wrote:
> 
> >On 2016-11-14 00:01:42 +0200, Martin Storsjö wrote:
> >>The subtraction directly on sp is one of the operations that are
> >>allowed in thumb mode.
> >>---
> >> libavcodec/arm/vp9itxfm_neon.S | 6 ++----
> >> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
> >>index cdb43b5..46fd010 100644
> >>--- a/libavcodec/arm/vp9itxfm_neon.S
> >>+++ b/libavcodec/arm/vp9itxfm_neon.S
> >>@@ -796,10 +796,9 @@ function ff_vp9_\txfm1\()_\txfm2\()_16x16_add_neon, export=1
> >>         @ Align the stack, allocate a temp buffer
> >> T       mov             r12, sp
> >> T       bic             r12, r12, #15
> >>-T       sub             r12, r12, #512
> >> T       mov             sp,  r12
> >> A       bic             sp,  sp,  #15
> >>-A       sub             sp,  sp,  #512
> >>+        sub             sp,  sp,  #512
> >
> >sub r12, sp, #512 works too and saves the first mov
> 
> Oh, right
> 
> >or better
> >
> >T mov r7, sp
> >T bic r7, r7, #15
> >A bic r7, sp, #15
> 
> I guess that should be "and r7, r7/sp, #15" then

err, yes
> 
> > add r7, r7, #512
> > sub sp, r7
> >
> >and restore the stack with
> >add sp, r7
> 
> Hmm, that ends up as 4 instructions for thumb and 3 for arm, for aligning
> and subtracting. Isn't that equal to, not better than, what you achieve with
> your suggestion with "sub r12, sp, #512" for thumb?

yes, better only in the sense that arm and thumb share one more
instruction
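
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python that models the three sequences discussed above on 32-bit sp values.
It is not part of the patch; the function names are hypothetical, and the
second variant assumes the mov/bic pair stays in front of "mov sp, r12".

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def align_then_sub(sp, size=512):
    """Model of: bic sp, sp, #15 ; sub sp, sp, #size"""
    sp = sp & ~15 & MASK32
    return (sp - size) & MASK32


def sub_then_align(sp, size=512):
    """Model of: sub r12, sp, #size ; bic r12, r12, #15 ; mov sp, r12"""
    r12 = (sp - size) & MASK32
    r12 = r12 & ~15 & MASK32
    return r12


def offset_in_r7(sp, size=512):
    """Model of: and r7, sp, #15 ; add r7, r7, #size ; sub sp, r7"""
    r7 = (sp & 15) + size          # misalignment plus buffer size
    new_sp = (sp - r7) & MASK32
    return new_sp, r7


def restore(sp, r7):
    """Model of: add sp, r7"""
    return (sp + r7) & MASK32


if __name__ == "__main__":
    for sp in range(0x7FFF0000, 0x7FFF0020):
        new_sp, r7 = offset_in_r7(sp)
        assert new_sp == align_then_sub(sp) == sub_then_align(sp)
        assert new_sp % 16 == 0
        assert restore(new_sp, r7) == sp
    print("all variants agree")
```

The first two variants agree because 512 is a multiple of 16, so the order
of the sub and the bic does not matter. The r7 variant reaches the same
aligned sp and also keeps the total offset around, which is what lets
"add sp, r7" undo it.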

Janne
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
