Hello,
Good news: after some work, it seems I've got a working solution.
@Steve, note that CMSIS does not provide functions for *atan2()* and
*floor()*.
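Since no ready-made replacement is available, one common workaround for
floor() is a cast-based version. This is only a minimal sketch:
fast_floorf() is a hypothetical helper (it is not in Codec2 or CMSIS), and
it assumes the input is finite and fits in an int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of Codec2 or CMSIS. */
static float fast_floorf(float x)
{
    /* Assumption: x fits in an int; the cast below is undefined otherwise. */
    assert(x > (float)INT_MIN && x < (float)INT_MAX);

    int i = (int)x;        /* the cast truncates toward zero */
    if ((float)i > x) {    /* negative non-integers were rounded up */
        i -= 1;
    }
    return (float)i;
}
```

Whether this is actually faster than the library floorf() on a given
toolchain would have to be measured on the target.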
First, I tried restricting everything to single-precision operations (no
double operations, because the M4's hardware FPU cannot handle double
precision).
In Codec2, a lot of operations involve doubles, which does not always seem
necessary. For example, M_PI and the other defines whose values lack the
'f' suffix are treated as double in ANSI C, so operations using those
defines are double operations and are not optimized.
The same happens every time you do an operation with a literal without the
'f' suffix. For example, 0.5*x or 0.5+x is a double operation, which I
changed to "0.5f*x" and "0.5f+x".
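To make the promotion rule concrete, here is a small sketch. PI_F and the
two scale functions are hypothetical names made up for illustration; they
are not taken from the Codec2 sources. The sizeof checks show which type
the compiler gives each expression.

```c
#include <assert.h>

/* Hypothetical single-precision constant (the 'f' suffix makes it a float). */
#define PI_F 3.14159265358979f

/* 0.5 and 2.0 are double literals, so x is promoted and the math is double. */
static float scale_double(float x) { return 0.5 * x + 2.0; }

/* All operands are float, so the hardware single-precision FPU can be used. */
static float scale_float(float x) { return 0.5f * x + 2.0f; }

static void check_expression_types(void)
{
    float x = 1.0f;
    assert(sizeof(0.5 * x) == sizeof(double));  /* promoted to double */
    assert(sizeof(0.5f * x) == sizeof(float));  /* stays float */
    assert(sizeof(PI_F * x) == sizeof(float));  /* stays float */
}
```

If you build with GCC, the -Wdouble-promotion warning can help find the
remaining places where a float is implicitly promoted to double.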
After fixing this in the codec, I got a speedup of at least 10% for
decoding!
I cannot say whether this is a good way to speed things up or whether you
can live with the loss of precision, but if so, maybe this fix can be made
in your main repository.
After that I tested some other compile options and optimizations (O2 and
some additional inlining) without success.
Finally, I tried switching the target's floating-point mode from
"strict" to "relaxed".
See the definition in the wiki:
> Relaxed mode prioritizes speed over strict correctness. In relaxed mode,
> the compiler may perform speed optimizations at the expense of reducing the
> precision of some calculations, typically a tiny amount. For instance,
> (X/3) is not precisely equivalent to (X*(1.0/3)), but in relaxed mode, the
> compiler is allowed to make this transformation anyway, as multiplication
> is much faster than division.
Changing that gave me a speedup of about 45%!
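The X/3 example from the definition can be checked with a short sketch.
The function names are hypothetical and only serve the illustration; the
code counts how many small integer inputs give different results for the
division and for the multiplication by the reciprocal when both are done
in single precision.

```c
#include <assert.h>

/* What the source code asks for: a true division. */
static float div_by_3(float x) { return x / 3.0f; }

/* What a relaxed mode is allowed to substitute: multiply by 1/3. */
static float mul_by_third(float x) { return x * (1.0f / 3.0f); }

/* Count inputs 1..n for which the two forms disagree. */
static int count_mismatches(int n)
{
    assert(n > 0);
    int mismatches = 0;
    for (int i = 1; i <= n; i++) {
        float x = (float)i;
        if (div_by_3(x) != mul_by_third(x)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Where the two results differ, it is only in the last bits of the float,
which is the "tiny amount" of precision the definition refers to.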
Here are the timings after all the fixes:
- Encoding time *without* modification was between *25 ms* and *42 ms*.
After modification it is between *18 ms* and *19 ms*, so a speedup of
about 55%. My processor will be loaded at *48%* when encoding sound
sampled at 8000 Hz.
- Decoding time *without* modification was between *39 ms* and *56 ms*.
After modification it is between *23 ms* and *27 ms*, so a speedup of
about 52%. My processor will be loaded at *68%* when decoding sound
sampled at 8000 Hz.
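For anyone who wants to redo this arithmetic with their own measurements,
here is a small sketch. The helper names are hypothetical, and the 40 ms
frame period is an assumption inferred from the figures above (19 ms at
48% load); it is not a value stated in this mail. The speedup is computed
on the worst-case times.

```c
#include <assert.h>

/* Percentage of time saved going from before_ms to after_ms. */
static float time_reduction_pct(float before_ms, float after_ms)
{
    assert(before_ms > 0.0f);
    return 100.0f * (before_ms - after_ms) / before_ms;
}

/* CPU load if one frame must be processed every frame_ms milliseconds.
   Assumption: frame_ms is 40 for the modes measured here. */
static float cpu_load_pct(float worst_case_ms, float frame_ms)
{
    assert(frame_ms > 0.0f);
    return 100.0f * worst_case_ms / frame_ms;
}
```

With the worst-case times above, time_reduction_pct(42.0f, 19.0f) and
time_reduction_pct(56.0f, 27.0f) come out near 55% and 52%, and
cpu_load_pct(19.0f, 40.0f) and cpu_load_pct(27.0f, 40.0f) near 48% and
68%.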
I've played back the encoded streams at 1200 bps and 1300 bps and
everything seems okay: I cannot hear any strong difference between the
versions encoded with and without my modifications.
I hope this will help other people get it working on their targets.
Regards,
Max
2016-06-18 9:54 GMT+02:00 glen english <[email protected]>:
> RRR
> I usually find O3 fractionally faster but alot of things break that I
> dont expect (bad programming habits?). they don't break in O2. some
> unexpected assumptions are made...
>
>
> On 18/06/2016 1:25 PM, Steve wrote:
> > Another algorithm that seems to suck a lot of CPU is
> > phase_synth_zero_order() in decoding, and really the only thing in
> > there is atan2() and floor(). (you've already changed the sin/cos). So
> > maybe the CMSIS has a better version for those two?
> >
> > I know floor() is really a slow algorithm in gcc.
> >
> > http://stackoverflow.com/questions/824118/why-is-floor-so-slow
> >
> >
>
> _______________________________________________
> Freetel-codec2 mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/freetel-codec2
>