On 2016-11-23 15:00:49 +0200, Martin Storsjö wrote:
> ---
>  libavcodec/aarch64/vp9itxfm_neon.S | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/libavcodec/aarch64/vp9itxfm_neon.S 
> b/libavcodec/aarch64/vp9itxfm_neon.S
> index 2dc6b75..21352f0 100644
> --- a/libavcodec/aarch64/vp9itxfm_neon.S
> +++ b/libavcodec/aarch64/vp9itxfm_neon.S
> @@ -1001,7 +1001,6 @@ function idct32_1d_8x32_pass2_neon
>  
>          idct16
>  
> -        mov             x9,  #128
>  .irp i, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
>          st1             {v\i\().8h}, [x2], x9
>  .endr
> @@ -1018,7 +1017,6 @@ function idct32_1d_8x32_pass2_neon
>  
>          idct32_odd
>  
> -        mov             x9,  #128
>  .macro load_acc_store a, b, c, d, neg=0
>          ld1             {v4.8h},  [x2], x9
>          ld1             {v5.8h},  [x2], x9

patch ok although there still many redundant mov x9 left. It's enough to 
have it once in ff_vp9_idct_idct_32x32_add_neon (and a neg x7, x9) and 
use x7 instead of x9 in the neg=1 part of load_acc_store

The same applies to the 16x16 transforms

Janne
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to