Re: [FFmpeg-devel] [PATCH v3 2/2] lavc/aarch64: add pred16x16 10-bit functions

2021-04-14 Thread Mikhail Nitenko
Thank you, I've tried your suggestions and here is what I got (the left column is A53 and the right is A72):

current code:
pred16x16_top_dc_10_c:     106.0   93.2
pred16x16_top_dc_10_neon:   87.7   77.5

ld1, add, addv variant:
pred16x16_top_dc_10_c:     106.0   95.5
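For readers following the numbers, here is a minimal C sketch of what a 16x16 top-DC predictor computes on 10-bit samples. It is not the code from the patch or from FFmpeg: the function name is hypothetical, and the stride is assumed to be in pixels rather than bytes. The horizontal sum over the row above the block is the part that an across-vector add such as addv would handle in a NEON version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not FFmpeg's implementation).
 * Fills a 16x16 block with the rounded average of the 16 samples directly
 * above it. Samples are 10-bit values stored in uint16_t.
 * Assumption: stride is counted in pixels, not bytes. */
static void pred16x16_top_dc_10_sketch(uint16_t *src, ptrdiff_t stride)
{
    assert(stride >= 16);                 /* rows must not overlap */

    const uint16_t *top = src - stride;   /* row above the block */
    unsigned sum = 0;

    /* Horizontal sum of the top row; 16 * 1023 fits easily in unsigned. */
    for (int x = 0; x < 16; x++)
        sum += top[x];

    /* Divide by 16 with rounding to nearest. */
    uint16_t dc = (uint16_t)((sum + 8) >> 4);

    /* Broadcast the DC value to every sample of the block. */
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            src[y * stride + x] = dc;
}
```

The store loop touches 512 bytes per block regardless of how the sum is computed, which may be one reason the C and NEON timings above are relatively close.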

Re: [FFmpeg-devel] [PATCH v3 2/2] lavc/aarch64: add pred16x16 10-bit functions

2021-04-14 Thread chen
Inlined a few comments for ff_pred16x16_top_dc_neon_10; the others are similar.

At 2021-04-14 20:35:44, "Martin Storsjö" wrote:
>On Tue, 13 Apr 2021, Mikhail Nitenko wrote:
>
>> Benchmarks:
>> pred16x16_dc_10_c: 124.0
>> pred16x16_dc_10_neon: 97.2
>> pred16x16_horizontal_10_c: 71.7
>>

Re: [FFmpeg-devel] [PATCH v3 2/2] lavc/aarch64: add pred16x16 10-bit functions

2021-04-14 Thread Martin Storsjö
On Tue, 13 Apr 2021, Mikhail Nitenko wrote:

Benchmarks:
pred16x16_dc_10_c: 124.0
pred16x16_dc_10_neon: 97.2
pred16x16_horizontal_10_c: 71.7
pred16x16_horizontal_10_neon: 66.2
pred16x16_top_dc_10_c: 90.7
pred16x16_top_dc_10_neon: 71.5
pred16x16_vertical_10_c: 64.7
pred16x16_vertical_10_neon:

[FFmpeg-devel] [PATCH v3 2/2] lavc/aarch64: add pred16x16 10-bit functions

2021-04-13 Thread Mikhail Nitenko
Benchmarks:
pred16x16_dc_10_c: 124.0
pred16x16_dc_10_neon: 97.2
pred16x16_horizontal_10_c: 71.7
pred16x16_horizontal_10_neon: 66.2
pred16x16_top_dc_10_c: 90.7
pred16x16_top_dc_10_neon: 71.5
pred16x16_vertical_10_c: 64.7
pred16x16_vertical_10_neon: 61.7

Some functions work slower than C and are