Hi all,
I managed to improve codec2 performance on ARM NEON by a further 10%. I
replaced some math functions with those from the math-neon project (my
libc version is 2.13), so the overall ARM speedup is now 25% in my case.
Here are the oprofile reports on Exynos 4.
Vanilla:
samples  %        linenr info                image name             symbol name
16436    45.8453  kiss_fft.c:246             libcodec2.so.0.0.0     kf_work
3089     8.6162   s_floor.c:44               libm-2.13.so           floorl
2166     6.0417   nlp.c:209                  libcodec2.so.0.0.0     nlp
1760     4.9092   e_atan2.c:80               libm-2.13.so           __ieee754_atan2
1741     4.8562   fft.c:84                   libcodec2.so.0.0.0     fft
1306     3.6429   s_sin.c:353                libm-2.13.so           cosl
956      2.6666   (no location information)  no-vmlinux             /no-vmlinux
877      2.4462   sine.c:288                 libcodec2.so.0.0.0     hs_pitch_refinement
691      1.9274   lsp.c:143                  libcodec2.so.0.0.0     lpc_to_lsp
682      1.9023   lpc.c:75                   libcodec2.so.0.0.0     autocorrelate
657      1.8326   phase.c:61                 libcodec2.so.0.0.0     aks_to_H
626      1.7461   quantise.c:479             libcodec2.so.0.0.0     aks_to_M2
624      1.7405   s_sin.c:90                 libm-2.13.so           sinl
449      1.2524   sine.c:395                 libcodec2.so.0.0.0     est_voicing_mbe
326      0.9093   e_log.c:69                 libm-2.13.so           __ieee754_log
322      0.8982   sine.c:564                 libcodec2.so.0.0.0     synthesise
276      0.7699   sine.c:351                 libcodec2.so.0.0.0     estimate_amplitudes
263      0.7336   random.c:293               libc-2.13.so           random
228      0.6360   sine.c:207                 libcodec2.so.0.0.0     dft_speech
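For a rough sense of how much of the vanilla profile lands in libm, the
percentages of the libm-2.13.so rows in the report above can simply be
added up. This is only a sketch over the rows shown here (the report
lists just the top symbols), and the names vanilla_libm_pct and
vanilla_libm_share are made up for illustration.

```c
#include <stddef.h>

/* Percentages of the libm-2.13.so rows in the vanilla report above. */
static const double vanilla_libm_pct[] = {
    8.6162, /* floorl          */
    4.9092, /* __ieee754_atan2 */
    3.6429, /* cosl            */
    1.7405, /* sinl            */
    0.9093  /* __ieee754_log   */
};

/* Hypothetical helper: sum of the listed libm percentages. */
static double vanilla_libm_share(void)
{
    double total = 0.0;
    size_t n = sizeof vanilla_libm_pct / sizeof vanilla_libm_pct[0];
    for (size_t i = 0; i < n; i++)
        total += vanilla_libm_pct[i];
    return total;
}
```

The five rows add up to about 19.8% of the samples. Percentages from
different reports are relative to each run's own sample total, so they
should not be compared directly as timings.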
math-neon:
samples  %        linenr info                image name             symbol name
3369     49.2976  kiss_fft.c:246             libcodec2.so.0.0.0     kf_work
438      6.4091   nlp.c:209                  libcodec2.so.0.0.0     nlp
413      6.0433   sine.c:288                 libcodec2.so.0.0.0     hs_pitch_refinement
347      5.0776   fft.c:84                   libcodec2.so.0.0.0     fft
339      4.9605   math_floorf.c:39           libmath_neon.so.0.0.0  floorf_neon_hfp
227      3.3216   (no location information)  no-vmlinux             /no-vmlinux
146      2.1364   lpc.c:78                   libcodec2.so.0.0.0     autocorrelate
140      2.0486   s_sin.c:353                libm-2.13.so           cosl
133      1.9462   math_floorf.c:54           libmath_neon.so.0.0.0  floorf_neon_sfp
132      1.9315   lsp.c:143                  libcodec2.so.0.0.0     lpc_to_lsp
131      1.9169   quantise.c:479             libcodec2.so.0.0.0     aks_to_M2
121      1.7706   math_sinf.c:73             libmath_neon.so.0.0.0  sinf_neon_hfp
98       1.4340   e_log.c:69                 libm-2.13.so           __ieee754_log
81       1.1853   math_atan2f.c:96           libmath_neon.so.0.0.0  atan2f_neon_hfp
78       1.1414   phase.c:61                 libcodec2.so.0.0.0     aks_to_H
62       0.9072   sine.c:564                 libcodec2.so.0.0.0     synthesise
58       0.8487   phase.c:200                libcodec2.so.0.0.0     phase_synth_zero_order
43       0.6292   sine.c:206                 libcodec2.so.0.0.0     dft_speech
41       0.5999   random.c:293               libc-2.13.so           random
math-neon+libavcodec FFT:
samples  %        linenr info                image name             symbol name
665      36.1610  (no location information)  libavcodec.so.53.7.0   /usr/lib/libavcodec.so.53.7.0
225      12.2349  (no location information)  no-vmlinux             /no-vmlinux
131      7.1234   sine.c:288                 libcodec2.so.0.0.0     hs_pitch_refinement
127      6.9059   nlp.c:209                  libcodec2.so.0.0.0     nlp
103      5.6009   fft.c:183                  libcodec2.so.0.0.0     fft
85       4.6221   math_floorf.c:39           libmath_neon.so.0.0.0  floorf_neon_hfp
42       2.2838   lsp.c:143                  libcodec2.so.0.0.0     lpc_to_lsp
42       2.2838   s_sin.c:353                libm-2.13.so           cosl
41       2.2295   math_floorf.c:54           libmath_neon.so.0.0.0  floorf_neon_sfp
39       2.1207   quantise.c:479             libcodec2.so.0.0.0     aks_to_M2
39       2.1207   lpc.c:75                   libcodec2.so.0.0.0     autocorrelate
34       1.8488   math_sinf.c:73             libmath_neon.so.0.0.0  sinf_neon_hfp
22       1.1963   e_log.c:69                 libm-2.13.so           __ieee754_log
22       1.1963   math_atan2f.c:96           libmath_neon.so.0.0.0  atan2f_neon_hfp
18       0.9788   phase.c:200                libcodec2.so.0.0.0     phase_synth_zero_order
17       0.9244   interp.c:0                 libc-2.13.so           memcpy
16       0.8700   sine.c:206                 libcodec2.so.0.0.0     dft_speech
16       0.8700   math_sinf.c:114            libmath_neon.so.0.0.0  sinf_neon_sfp
15       0.8157   sine.c:564                 libcodec2.so.0.0.0     synthesise
The GitHub code has been updated.
I wonder, what if one could profile speex and do the same math-neon trick:
Regards,
Vadim Markovtsev,
Engineer, Algorithmic Lab,
Moscow R&D center, Samsung Electronics