Personally, it seems pretty fast: a max of only about 9 us for most ops on floats
(doubles are a different story). It can be as low as almost 1 us or as high as
55 us, depending on the operation.
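
A minimal sketch of how per-op figures like these could be measured is below. It is an illustration only, not the benchmark behind the numbers above. The function names us_per_op_float and us_per_op_double are hypothetical, and it assumes the standard C clock() has usable resolution on the target, which may not hold on a bare-metal MCU.

```c
#include <time.h>

/* Hypothetical helper: average microseconds per multiply-add on float.
 * The volatile qualifiers force a real load, multiply-add, and store on
 * every iteration so the compiler cannot fold the loop away. */
double us_per_op_float(unsigned long n)
{
    volatile float acc = 0.0f;
    volatile float x = 0.9999f;   /* below 1 so acc stays bounded near 5000 */
    unsigned long i;
    clock_t start, end;

    if (n == 0) {
        return 0.0;               /* nothing to time, avoid dividing by zero */
    }
    start = clock();
    for (i = 0; i < n; i++) {
        acc = acc * x + 0.5f;
    }
    end = clock();
    return (double)(end - start) * 1e6 / (double)CLOCKS_PER_SEC / (double)n;
}

/* Same loop on double, to compare against the float figure. */
double us_per_op_double(unsigned long n)
{
    volatile double acc = 0.0;
    volatile double x = 0.9999;
    unsigned long i;
    clock_t start, end;

    if (n == 0) {
        return 0.0;
    }
    start = clock();
    for (i = 0; i < n; i++) {
        acc = acc * x + 0.5;
    }
    end = clock();
    return (double)(end - start) * 1e6 / (double)CLOCKS_PER_SEC / (double)n;
}
```

Calling both functions with the same large n (say 1000000UL) and printing the results gives a rough float-versus-double comparison. The figure includes loop and memory-access overhead, so it is an upper bound on the cost of the arithmetic itself. On an ESP32 one would likely swap clock() for esp_timer_get_time() from ESP-IDF; that substitution is an assumption about the target, not something tested here.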

> On Feb 6, 2020, at 12:45 AM, DANA MYERS <[email protected]> wrote:
> 
> 
> 
>>> On February 5, 2020 at 9:25 PM Bruce Perens via Freetel-codec2 
>>> <[email protected]> wrote: 
>>> 
>>> Dana, the only thing you didn't make clear is whether your code is using 
>>> the fixed or floating data type. If it's using the floating one, it would 
>>> be interesting to isolate why performance is so poor when more conventional 
>>> code is generated by the compiler. I can understand float code being 
>>> slightly slower than double, if the hardware FPU is implemented in double 
>>> size, as it normally would be.
>> Yes, I am using floating types on both the Cortex-M4F and ESP32. My
>> apologies for calling-out M4F without mentioning the significance of the
>> 'F' :-).
>> 
>> MCU FPUs are, in my limited experience (Cortex-M4F and ESP32),
>> single-precision. IIRC, higher-end parts (Cortex-M7) may feature 
>> double-precision.
>> 
>> I don't know why the optimized assembly is 2x faster than compiled code;
>> that would be a question for Espressif/Tensilica, I suppose.
>> The floating performance as previously benchmarked is poor enough that I 
>> wondered whether there was really hardware, or whether some of that blobby 
>> code was processing float in an exception handler.
> As did I. So I gave it a try.
> 
> Cheers,
> Dana
> 
>> On Wed, Feb 5, 2020, 8:16 PM Dana Myers < [email protected]> wrote: 
>> On 2/5/2020 4:25 PM, Bruce Perens via Freetel-codec2 wrote: 
>> > Bill, before you go any farther, you should make a floating point 
>> > benchmark. I don't believe the necessary performance is there. 
>> 
>> I used to think that, but then Espressif released their ESP-DSP library. I
>> ported my Bell 202 modem from Cortex-M4F using CMSIS-DSP to ESP32 running at
>> 240MHz using ESP-DSP and see comparable performance per clock, and my actual
>> modem is single-threaded and thus uses only one of the ESP32's two cores. It's
>> conceivable that, if some additional latency is tolerable and the algorithm
>> divisible, it could be split over the two cores.
>> 
>> Espressif offers both "ANSI C" portable functions and ESP32-specific assembly
>> functions - the latter are considerably faster and what I am using.
>> 
>> Cheers, 
>> Dana  K6JQ 
>> 
>> 
>> 
>> _______________________________________________ 
>> Freetel-codec2 mailing list 
>> [email protected] 
>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2 
> 
>  