Hi Glen, what are you measuring, and how? I would like to repeat the measurements in my environment.
We use the cycle counter of the Cortex processor (it seems you do the same). Interrupt overhead is removed: the counter is stopped during the major interrupt, which is the audio IRQ. I have some prerecorded "frames" stored in flash which are fed into the freedv_comprx routine. Measurements are taken every 50 frames, since initially the decoder needs to lock on the data. After 50 frames I get stable measurements.

I did some hotspot analysis, and a single kiss_fft 512-point FFT call was taking around 670 us, which at 168 cycles per microsecond is roughly 112,500 cycles. So this is in line with your numbers.

My results for kiss_fft vs. arm_fft may not be correct, since I was in fact running kiss_fftr vs. arm_rfft_fast_f32. Maybe there is a huge penalty for the arm_rfft_fast_f32 code. Let me try the kiss_fft vs. arm_cfft case to confirm your measurements.

Danilo

On 18.09.2016 06:07, glen english wrote:
> arm fft, stm32F405RGT6, 6WS
> -O2, debug level2.
>
> encode 1200 (40mS frame)
> 1745799 : 10.39mS
> decode
> 2292497 : 13.64mS
>
> so, 2x speed of my other runs
>
> let's run 5WS
>
> encode 1200 (40mS frame)
> 1579922 : 9.4mS
> decode
> 2073979 : 12.34mS
>
> WOW the wait states hurt on the M4 !!!!
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Freetel-codec2 mailing list
> Freetel-codec2@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/freetel-codec2
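
For anyone repeating these numbers, the cycle arithmetic in this thread can be sketched in a few lines of C. This is only an illustration, not the code used for the measurements above. It assumes a 168 MHz core clock (implied by the "*168" factor), and all function names are made up for this sketch. Where the message above stops the counter during the audio IRQ, the sketch subtracts a separately measured ISR cycle count instead, which should give the same net figure. On the target, the start and end values would presumably be reads of a free-running 32-bit cycle counter.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 168 MHz core clock, implied by the "*168" factor in the thread. */
#define CPU_HZ 168000000u

/* Convert a cycle count to microseconds at the given core clock. */
static double cycles_to_us(uint32_t cycles, uint32_t cpu_hz)
{
    assert(cpu_hz != 0u);
    return (double)cycles * 1e6 / (double)cpu_hz;
}

/* Convert a duration in microseconds to cycles, rounded to nearest. */
static uint32_t us_to_cycles(double us, uint32_t cpu_hz)
{
    return (uint32_t)(us * (double)cpu_hz / 1e6 + 0.5);
}

/* Cycles between two reads of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single wraparound. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Net cycles for the measured code, with the cycles spent in the
 * interrupt handler (hypothetically measured separately) removed. */
static uint32_t net_cycles(uint32_t start, uint32_t end, uint32_t isr_cycles)
{
    return elapsed_cycles(start, end) - isr_cycles;
}
```

With these helpers, us_to_cycles(670.0, CPU_HZ) gives 112560 cycles for the kiss_fft call, and cycles_to_us(1745799u, CPU_HZ) gives about 10392 us, matching the 10.39 ms encode figure quoted above.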