For cache optimization you have to consider the line size, the cache's line hashing (set indexing) scheme, and the associativity, that is, how many lines that hash to the same set can be resident at once. The classic example is three-operand array processing (c[i] = a[i] + b[i]) on a two-way set-associative cache, with buffers aligned so that all three map to the same sets. Three lines then compete for two ways, and you end up with a flush and a load on every operation. This used to hit us on SGI workstations with MIPS processors.
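To make the three-operand case concrete, here is a minimal sketch of a toy cache model in C. Everything in it is an assumption for illustration: the names (cache_sim, cache_access, run_three_operand), the plain modulo set index, the LRU replacement, and the geometry in the usage note. None of it is taken from real MIPS or ARM hardware, which may index and replace lines differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SETS 256
#define MAX_WAYS 8

/* Toy set-associative cache with LRU replacement (illustrative only). */
typedef struct {
    unsigned line_size;               /* bytes per cache line */
    unsigned num_sets;                /* number of sets */
    unsigned ways;                    /* lines per set (associativity) */
    uint64_t clock;                   /* access counter used for LRU */
    unsigned long misses;             /* total misses so far */
    int      valid[MAX_SETS][MAX_WAYS];
    uint64_t line[MAX_SETS][MAX_WAYS];
    uint64_t last_use[MAX_SETS][MAX_WAYS];
} cache_sim;

void cache_init(cache_sim *c, unsigned line_size, unsigned num_sets,
                unsigned ways)
{
    assert(line_size > 0 && num_sets > 0 && num_sets <= MAX_SETS);
    assert(ways > 0 && ways <= MAX_WAYS);
    memset(c, 0, sizeof *c);
    c->line_size = line_size;
    c->num_sets = num_sets;
    c->ways = ways;
}

/* Assumed hashing scheme: line number modulo the number of sets. */
unsigned cache_set_index(const cache_sim *c, uint64_t addr)
{
    return (unsigned)((addr / c->line_size) % c->num_sets);
}

/* Simulate one access; on a miss, evict the least recently used way. */
void cache_access(cache_sim *c, uint64_t addr)
{
    uint64_t line = addr / c->line_size;
    unsigned set = cache_set_index(c, addr);
    unsigned victim = 0;

    c->clock++;
    for (unsigned w = 0; w < c->ways; w++) {
        if (c->valid[set][w] && c->line[set][w] == line) {
            c->last_use[set][w] = c->clock;   /* hit */
            return;
        }
        /* Empty ways have last_use == 0, so they are chosen first. */
        if (c->last_use[set][w] < c->last_use[set][victim])
            victim = w;
    }
    c->misses++;
    c->valid[set][victim] = 1;
    c->line[set][victim] = line;
    c->last_use[set][victim] = c->clock;
}

/* Access pattern of dst[i] = a[i] + b[i]; returns the total miss count. */
unsigned long run_three_operand(cache_sim *c, uint64_t a, uint64_t b,
                                uint64_t dst, unsigned n, unsigned elem_size)
{
    for (unsigned i = 0; i < n; i++) {
        cache_access(c, a + (uint64_t)i * elem_size);
        cache_access(c, b + (uint64_t)i * elem_size);
        cache_access(c, dst + (uint64_t)i * elem_size);
    }
    return c->misses;
}
```

With an assumed geometry of 32-byte lines, 128 sets and 2 ways, calling run_three_operand on buffers at 0x10000, 0x11000 and 0x12000 with 1024 four-byte elements misses on every access (3072 misses) in this model, because all three buffers index the same set at each step. Staggering the second and third buffers by one and two lines (0x11000 + 32, 0x12000 + 64) leaves only the 384 compulsory misses.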

On Tue, Jul 28, 2015 at 7:15 PM, Steve <[email protected]> wrote:
> Yes, the frame of data. kissfft uses one array for input, and another for
> output, but if I recall, the input is never used. Although, I'm sure that
> gives it a speed advantage. Anyway, it will be used on the next pass :-)
>
> I used the code from JTransforms, where the internal tables are about 2K so
> not much advantage there, but I'm sure ARM has optimized memory code.

_______________________________________________
Freetel-codec2 mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
