2017-11-18 18:35 GMT+01:00 Rafal Dabrowa <fatwild...@gmail.com>:

> For performance testing the following command was used:
>
>     time ./ffmpeg -hide_banner -i ~/bbb-1280x720-cfg06.mkv -f yuv4mpegpipe - >/dev/null

An alternative is:
./ffmpeg -benchmark -i ~/bbb-1280x720-cfg06.mkv -f null -

> The video file was pre-read before test to minimize disk reads during testing.
> Program execution time without optimization was as follows:
>
> real    11m48.576s
> user    43m8.111s
> sys     0m12.469s
>
> Execution time with optimizations:
>
> real    6m17.046s
> user    21m19.792s
> sys     0m14.724s

Looks impressive.
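
To put a number on the timings quoted above, here is a minimal sketch that converts `time`-style readings to seconds and takes the ratio. The helper names `to_seconds` and `speedup` are made up for this illustration and are not part of FFmpeg.

```c
#include <stdio.h>

/* Hypothetical helper: convert a `time`-style "XmY.YYYs" reading to seconds. */
static double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Hypothetical helper: speedup factor = time before / time after. */
static double speedup(double before_s, double after_s)
{
    return before_s / after_s;
}

/* Print the speedup factors for the "real" and "user" figures quoted above. */
static void print_quoted_speedups(void)
{
    double real_x = speedup(to_seconds(11, 48.576), to_seconds(6, 17.046));
    double user_x = speedup(to_seconds(43, 8.111), to_seconds(21, 19.792));

    printf("real: %.2fx  user: %.2fx\n", real_x, user_x);
}
```

With the quoted figures this works out to roughly 1.88x in wall-clock time (708.576 s vs. 377.046 s) and roughly 2.02x in user CPU time (2588.111 s vs. 1279.792 s).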


> +av_cold void ff_hevc_dsp_init_aarch64(HEVCDSPContext *c, const int bit_depth)
> +{
> +    int cpu_flags = av_get_cpu_flags();
> +
> +    if (have_neon(cpu_flags) && bit_depth == 8) {
> +        NEON8_FNASSIGN(c->put_hevc_epel, 0, 0, pel_pixels);
> +        NEON8_FNASSIGN(c->put_hevc_epel, 0, 1, epel_h);
> +        NEON8_FNASSIGN(c->put_hevc_epel, 1, 0, epel_v);
> +        NEON8_FNASSIGN(c->put_hevc_epel, 1, 1, epel_hv);
> +        NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 0, epel_uni_v);
> +        NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 1, epel_uni_hv);
> +        NEON8_FNASSIGN(c->put_hevc_epel_bi, 0, 0, pel_bi_pixels);
> +        NEON8_FNASSIGN(c->put_hevc_epel_bi, 0, 1, epel_bi_h);
> +        NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 0, epel_bi_v);
> +        NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 1, epel_bi_hv);
> +        NEON8_FNASSIGN(c->put_hevc_qpel, 0, 0, pel_pixels);
> +        NEON8_FNASSIGN(c->put_hevc_qpel, 0, 1, qpel_h);
> +        NEON8_FNASSIGN(c->put_hevc_qpel, 1, 0, qpel_v);
> +        NEON8_FNASSIGN(c->put_hevc_qpel, 1, 1, qpel_hv);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_uni, 0, 1, qpel_uni_h);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 0, qpel_uni_v);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 1, qpel_uni_hv);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_bi, 0, 0, pel_bi_pixels);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_bi, 0, 1, qpel_bi_h);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 0, qpel_bi_v);
> +        NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 1, qpel_bi_hv);
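
The definition of NEON8_FNASSIGN is not part of the quoted hunk. As a rough sketch of the pattern only, assuming the macro fills one table slot per block size for a given (vertical, horizontal) filter combination, something like the following could be used. All names here (`SketchDSP`, `SKETCH_FNASSIGN`, the `sketch_*` stubs, the two-entry size dimension) are hypothetical and simplified; they are not the patch's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified function type for a "put" prediction routine. */
typedef void (*sketch_put_fn)(int16_t *dst, const uint8_t *src,
                              ptrdiff_t stride, int height);

/* Hypothetical context: table indexed as [size index][vertical][horizontal].
 * Only two block sizes are modelled here to keep the sketch short. */
typedef struct SketchDSP {
    sketch_put_fn put_epel[2][2][2];
} SketchDSP;

/* Generate empty stand-ins for the optimized routines. */
#define SKETCH_STUB(name)                                          \
    static void name(int16_t *dst, const uint8_t *src,             \
                     ptrdiff_t stride, int height)                 \
    { (void)dst; (void)src; (void)stride; (void)height; }

SKETCH_STUB(sketch_pel_pixels4)
SKETCH_STUB(sketch_pel_pixels8)
SKETCH_STUB(sketch_epel_h4)
SKETCH_STUB(sketch_epel_h8)

/* Assumed shape of the assignment macro: one slot per block size,
 * with the function name built by token pasting. */
#define SKETCH_FNASSIGN(member, v, h, fn)       \
    do {                                        \
        (member)[0][v][h] = sketch_##fn##4;     \
        (member)[1][v][h] = sketch_##fn##8;     \
    } while (0)

/* Install the optimized routines only when the CPU feature is present
 * and the bit depth matches, mirroring the guard in the quoted code. */
static void sketch_dsp_init(SketchDSP *c, int have_simd, int bit_depth)
{
    if (have_simd && bit_depth == 8) {
        SKETCH_FNASSIGN(c->put_epel, 0, 0, pel_pixels);
        SKETCH_FNASSIGN(c->put_epel, 0, 1, epel_h);
    }
}
```

Slots that are not assigned keep whatever the generic init put there (NULL in this sketch), which is why each (v, h) combination without a listed assignment falls back to the existing C implementation.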

I wonder if it would have made sense to test and send the patches
in smaller portions, so that those with possible improvements
can be identified.

Thank you, Carl Eugen