Hi Danilo

Yes, there is most certainly a penalty for const access from flash, at least on the M4.
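
For others following along, here is a minimal sketch of keeping a lookup table out of flash so that the inner loop reads it from RAM. It assumes GCC on an ELF target and a linker script that places .data.* input sections in RAM (the default GNU ld scripts do; whether the mcHF script does is an assumption). The section name .data.fft_tables, the table contents and the function names are all hypothetical, made up for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name. Assumes the linker script collects .data.*
 * input sections into the RAM-resident .data output section. */
#if defined(__GNUC__) && defined(__ELF__)
#define RAM_TABLE __attribute__((section(".data.fft_tables")))
#else
#define RAM_TABLE /* no-op on toolchains without ELF section attributes */
#endif

#define TABLE_LEN 8u

/* Declared without const in this sketch so that it is an ordinary
 * initialised RAM object: the startup code copies it out of flash once,
 * and every later read is a RAM access. Values are illustrative only. */
RAM_TABLE static int16_t q15_table[TABLE_LEN] = {
    0, 12539, 23170, 30273, 32767, 30273, 23170, 12539
};

/* Bounds-checked read of one table entry. */
int16_t table_at(size_t i)
{
    assert(i < TABLE_LEN);
    return q15_table[i];
}

/* Walks the whole table, standing in for an inner loop that would
 * otherwise fetch every coefficient from flash. */
int32_t table_sum(void)
{
    int32_t acc = 0;
    for (size_t i = 0; i < TABLE_LEN; i++) {
        acc += q15_table[i];
    }
    return acc;
}
```

Timing a loop like table_sum() with and without the RAM_TABLE attribute (and with the table marked const) would be one way to measure the flash penalty on a given board. The cost is RAM: every table moved this way occupies both flash (the initial image) and RAM.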
And of course the instruction cache is no use here: it is just that, a cache for instructions only.

I wonder what the bus matrix penalty is for data fetches from flash. There is an app note about it somewhere that I once read. It depends on what else is going on. The processor is pretty smart at interleaving the accesses so as not to stall the pipeline or the bus matrix.

As I am sure you know, Danilo (pointed out for others): you can force variables into sections (for example, forcing a static const) by using __attribute__((section("name"))) to assign it, say, into data.

I will at some point run the code up on an M4 and see what kiss-fft does. I am very, very surprised, and do not really believe that kissFFT is as fast as the ARM assembler on the M4. My immediate thought is "you are doing it wrong". So, I will investigate.

g

On 18/09/2016 12:13 PM, Danilo Beuche wrote:
> Hi Glen,
>
> just checked, it seems to me that we have all the caches running in the
> mcHF:
>
> https://github.com/df8oe/mchf-github/blob/670f94a2e69a55a03f099ad25390925c84c09201/mchf-eclipse/cmsis_boot/system_stm32f4xx.c#L424
>
> So even with these caches/buffers enabled, the M4 loses performance on
> data reads from flash. Haven't checked the manual/internet, but maybe the
> flash caching works well for code but not in the same way for data.
>
>
> Danilo
>
> On 18.09.2016 at 03:12, glen english wrote:
>> Hi Danilo
>>
>> Good thoughts and points.
>>
>> While on the RAM subject: required reading for every serious programmer:
>> "Memory"
>> https://lwn.net/Archives/GuestIndex/#Drepper_Ulrich
>>
>> read all 7 parts, 100 pages, but if you only have an hour, just read
>> "part 2 - cache"
>>
>> On the fft:
>>
>> I am not surprised that the ARM lib hand optimized assembler is that
>> much faster.
>> more than 2x faster.... in fact.
>>
>> I don't think kiss-fft is particularly suitable for this sort of platform,
>> either, I'll hold back what I really think :-).
>>
>> The 5WS on flash (actually 6WS, I am running @ 168 MHz) does not really
>> affect the performance too much. In fact I can vary the WS count +/- 2
>> without much change: the ART and the prefetch and the instruction and
>> data caches are doing their job, so there is very little difference with
>> the const values in ram or cache.
>>
>> In fact, most FFT implementations are very tough on a machine with cache.
>> Have you read the paper on how FFTW works? It is very cache aware and
>> adaptive to the architecture; that is why it does trial runs and picks
>> the best.
>>
>> The M7 is very impressive. It is certainly impressive work by ARM.
>>
>> However, the M4 is what all of you have to work with so we can stay
>> focussed on that.
>>
>> I think also the ram usage will be significantly less with the arm FFT
>> because of the re-entrant Kiss-fft behaviour.
>>
>> The M4 is quite a different beast, and having no D-cache can improve
>> performance over the M7 for some (inaptly) written applications (not
>> this one, but as a generalization for applications grabbing a byte from
>> memory randomly and all over a large dataset).
>>
>> Large matrix operations are where cache machines fall over, that is once
>> the dataset is bigger than the cache....
>>
>> The question is how much optimization is enough. I am tempted NOT to
>> optimize any more, although I feel (just by looking at it) I can get
>> another 2x out of it..... Why? Well, there is no real pressing need.
>> Going too far away from the reference code will island the code a bit.
>> However, if you run out of modem cycles / modem ram, then we can probably
>> get a bit more...
>>
>> cheers
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Freetel-codec2 mailing list
>> Freetel-codec2@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2