For estimates I generally assume FPU ops = INT ops per clock, and if
you are careful you can do simultaneous INT/FPU ops...
Just watch out for non-aligned floating point accesses. Bang... I don't
see a lot of __aligned__ used.
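For anyone who has not been bitten by this yet, here is a minimal sketch of
what I mean, assuming GCC or Clang. The names samples, is_aligned and
load_float_unaligned are made up for illustration, they are not from Codec2
or any vendor library. As far as I know the FP load/store instructions on
the Cortex-M parts want word-aligned addresses, so check your core's
reference manual before trusting my memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample buffer. The aligned attribute (GCC/Clang) asks the
   compiler to place it on an 8-byte boundary. */
float samples[4] __attribute__((aligned(8))) = {1.0f, 2.0f, 3.0f, 4.0f};

/* Returns 1 if p sits on a multiple of align bytes, 0 otherwise. */
int is_aligned(const void *p, uintptr_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Reads a float from an arbitrary byte address. memcpy leaves it to the
   compiler to pick an access that is safe for the target, instead of
   casting the pointer and hoping the FP load does not trap. */
float load_float_unaligned(const uint8_t *bytes)
{
    float f;
    assert(bytes != NULL);
    memcpy(&f, bytes, sizeof f);
    return f;
}
```

Typical use would be to check is_aligned(samples, 8) once at start-up, and
to go through load_float_unaligned() whenever floats are pulled out of a
packed byte stream such as a received frame.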
I used to be a full-on SHARC guy, but I wonder where that market is now
with the M7 and M4F around, and NEON, which if you know what you are doing
can run rings around a SHARC.
Just diehards, I think. The simple things with SHARC have disappeared
with cache involvement.
On 8/04/2018 10:51 AM, Dana Myers wrote:
On 4/7/2018 5:22 PM, Bruce Perens wrote:
It could also be the use of memory barrier instructions. I'd like to
benchmark Codec2 rather than a simple floating point loop with
volatile variables. But if we are to believe the times on the screen
of the esp32 in the video, he was getting acceptable performance.
From what I've seen, the Cortex-M FPUs basically give single-precision
FP add/sub/mul in the same number of clocks as integer operations.
With a proper program store cache, Cortex-M4F is quite the rocket,
really.
Now I want to hunt down the appropriate Tensilica reference for the core
in the ESP32; it occurs to me the two cores may be sharing one FPU,
though I don't immediately see how the simple test would incur context
switching frequently.
73,
Dana K6JQ
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2