Our tests show that the CPU cycle count is reduced for each module:
~48% for qpel weight
~17% for epel
~71% for sao edge mode
~48% for sao band mode
~60% for idct of 16x16 block
Overall, decoding speeds up by 20-30% (in terms of FPS).
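
As a reading aid for the per-module figures: a reduction of r in cycle count corresponds to a speedup factor of 1 / (1 - r) for that module alone. The sketch below is illustrative and not part of the patch set; the function name is hypothetical and the percentages are the approximate ones listed above. It says nothing about the overall FPS gain, which also depends on how much of the total decoding time each module accounts for (not given in this message).

```python
def speedup_from_cycle_reduction(reduction_percent):
    """Return the speedup factor implied by a percentage reduction in cycles.

    Hypothetical helper for illustration only.
    """
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


# Approximate per-module cycle reductions as reported in this message.
reported_reductions = {
    "qpel weight": 48,
    "epel": 17,
    "sao edge mode": 71,
    "sao band mode": 48,
    "idct 16x16": 60,
}

for module, percent in reported_reductions.items():
    factor = speedup_from_cycle_reduction(percent)
    print(f"{module}: ~{factor:.2f}x faster")
```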

We also compared the decoded output before and after the optimization
to confirm that it is identical.
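
One way such a comparison could be scripted is sketched below. This is an assumption about the workflow, not a description of how the check was actually done: it presumes the same stream has been decoded to raw output files by a build without the patches and a build with them, and the file names in the usage comment are hypothetical. The script hashes both files and reports whether they are byte-identical.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(path_before, path_after):
    """Return True if the two decoded outputs are byte-identical."""
    return file_digest(path_before) == file_digest(path_after)


# Usage (hypothetical file names, one raw dump per build):
#   outputs_match("decoded_before.yuv", "decoded_after.yuv")
```

Hashing in chunks keeps memory use flat for large raw dumps. A bit-exact match is the appropriate criterion here because the optimized routines are meant to produce the same output as the C reference.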

These patches are based on the n3.4 release.

Meng Wang (5):
  avcodec/hevcdsp: Add NEON optimization for qpel weighted mode
  avcodec/hevcdsp: Add NEON optimization for epel
  avcodec/hevcdsp: Use pre-load (pld) to optimize data loading
  avcodec/hevcdsp: Add NEON optimization for sao
  avcodec/hevcdsp: Add NEON optimization for idct16x16

Shengbin Meng (1):
  avcodec/hevcdsp: Add NEON optimization for whole-pixel interpolation

 libavcodec/arm/Makefile            |    4 +-
 libavcodec/arm/hevcdsp_epel_neon.S | 2078 ++++++++++++++++++++++++++++++++++++
 libavcodec/arm/hevcdsp_idct_neon.S |  241 +++++
 libavcodec/arm/hevcdsp_init_neon.c |  695 ++++++++++++
 libavcodec/arm/hevcdsp_qpel_neon.S |  702 ++++++++++++
 libavcodec/arm/hevcdsp_sao_neon.S  |  181 ++++
 6 files changed, 3900 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/arm/hevcdsp_epel_neon.S
 create mode 100644 libavcodec/arm/hevcdsp_sao_neon.S

-- 
2.13.6 (Apple Git-96)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
