On 15/01/16 20:35, James Darnley wrote:
> Around 25% faster than the ssse3 version.
> ---
> @@ -27,7 +27,8 @@ DECLARE_ALIGNED(32, const ymm_reg,  ff_pw_1)    = { 0x0001000100010001ULL, 0x000
>  DECLARE_ALIGNED(32, const ymm_reg,  ff_pw_2)    = { 0x0002000200020002ULL, 0x0002000200020002ULL,
>                                                      0x0002000200020002ULL, 0x0002000200020002ULL };
>  DECLARE_ALIGNED(16, const xmm_reg,  ff_pw_3)    = { 0x0003000300030003ULL, 0x0003000300030003ULL };
> -DECLARE_ALIGNED(16, const xmm_reg,  ff_pw_4)    = { 0x0004000400040004ULL, 0x0004000400040004ULL };
> +DECLARE_ALIGNED(32, const ymm_reg,  ff_pw_4)    = { 0x0004000400040004ULL, 0x0004000400040004ULL,
> +                                                    0x0004000400040004ULL, 0x0004000400040004ULL };

Here you extend v210_enc_min_10 to 32 bytes. Shouldn't v210_enc_max_10
be extended to 32 bytes as well?
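
To illustrate the concern, here is a minimal C sketch, assuming both
constants end up as full-width memory operands in the AVX2 code (I have not
checked that against the asm). The type names, constant names and values
below are hypothetical stand-ins, not the real definitions from the patch:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the xmm_reg / ymm_reg types used in the patch. */
typedef struct { uint64_t q[2]; } xmm_const;   /* 16 bytes: one SSE register  */
typedef struct { uint64_t q[4]; } ymm_const;   /* 32 bytes: one AVX2 register */

static_assert(sizeof(xmm_const) == 16, "xmm constant must be 16 bytes");
static_assert(sizeof(ymm_const) == 32, "ymm constant must be 32 bytes");

/* Illustrative values only, not the real v210 clipping constants. */
static alignas(32) const ymm_const clip_min_32 = {
    { 0x0004000400040004ULL, 0x0004000400040004ULL,
      0x0004000400040004ULL, 0x0004000400040004ULL }
};
static alignas(16) const xmm_const clip_max_16 = {
    { 0x0004000400040004ULL, 0x0004000400040004ULL }
};

/* Returns 1 if a load of load_bytes stays inside a constant of const_bytes. */
static int load_fits(size_t const_bytes, size_t load_bytes)
{
    return load_bytes <= const_bytes;
}
```

With these definitions, load_fits(sizeof(clip_min_32), 32) is 1 while
load_fits(sizeof(clip_max_16), 32) is 0: a 32-byte load from the 16-byte
constant would read past its end, which is why I would expect both
constants to be widened together.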

lu
