On 20 September 2015 at 17:08, Adam <[email protected]> wrote:

> Daniel, can you clarify your comment of "the first two lines require
> memory allocation and might also have a bad memory profile"? I'm not sure
> if it's addressed in this latest gist or not.

I don't know how much you know about computers, so forgive me if I end up
telling you things you already know:

In modern computer architectures, CPUs are extremely fast compared to RAM.
Often the CPU spends most of the time waiting for data to arrive. When your
program needs data, it first tries to get it from the local CPU cache. If the
data is not in cache, it has to get it from RAM and the CPU has to wait.

Often, the absolute best way to optimize a program is to minimize cache
misses. The way to do that is to access data in the same order that it is
stored in memory. Re-arranging memory is usually bad, because it requires
copying memory and often requires accessing data out-of-order, which leads to
cache misses.

> This is an improvement! With that said, the Julia code is still 1.8x to 2x
> slower than the Matlab code. Any tips on additional changes I can make so I
> can (greatly) outperform the Matlab code?

Have a look at the new version I posted. I got an additional 4x improvement
by removing all the Dict's. Basically, the Dict's forced Julia's compiler to
produce very generic code. Using concrete types allowed the compiler to
optimize the code better.

Cheers,
Daniel.
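
As an illustration of the point about access order, here is a minimal sketch. It is not taken from the gist under discussion; the function names `sum_by_columns` and `sum_by_rows` and the matrix size are made up for the example. Julia stores a matrix column by column, so the version whose inner loop walks down a column touches memory in storage order, while the other one jumps a full column length on every step.

```julia
# Hypothetical example: both functions compute the same sum,
# but they visit the elements in a different order.

function sum_by_columns(A::Matrix{Float64})
    total = 0.0
    for j in 1:size(A, 2)        # outer loop over columns
        for i in 1:size(A, 1)    # inner loop walks down one column (contiguous memory)
            total += A[i, j]
        end
    end
    return total
end

function sum_by_rows(A::Matrix{Float64})
    total = 0.0
    for i in 1:size(A, 1)        # outer loop over rows
        for j in 1:size(A, 2)    # inner loop strides across columns (non-contiguous)
            total += A[i, j]
        end
    end
    return total
end

A = rand(2000, 2000)             # size chosen arbitrarily for the sketch

# Call each function once first so compilation time is not measured.
sum_by_columns(A)
sum_by_rows(A)

t_cols = @elapsed sum_by_columns(A)
t_rows = @elapsed sum_by_rows(A)
println("column order: ", t_cols, " s")
println("row order:    ", t_rows, " s")
```

On most machines the column-order loop should come out faster for a matrix too large to fit in cache, but the size of the gap depends on the hardware, so it is worth measuring rather than assuming.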

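The remark about Dicts and concrete types can be sketched in the same way. This is an assumed, simplified stand-in and not the code from the gist: the `Params` type, its fields `rate` and `steps`, and both function names are hypothetical. It is written for current Julia; in 2015-era Julia the `struct` keyword was spelled `immutable`. With a `Dict{String,Any}` the compiler cannot know what type each lookup returns, so it has to emit generic code; with a struct whose fields have concrete types, it can specialize the loop.

```julia
# Hypothetical parameter container with concretely typed fields.
struct Params
    rate::Float64
    steps::Int
end

# Generic version: every p["..."] lookup returns a value of type Any,
# so the arithmetic inside the loop cannot be specialized.
function accumulate_dict(p::Dict{String,Any})
    total = 0.0
    for k in 1:p["steps"]
        total += p["rate"] * k
    end
    return total
end

# Concrete version: the field types are known when the function is compiled.
function accumulate_struct(p::Params)
    total = 0.0
    for k in 1:p.steps
        total += p.rate * k
    end
    return total
end

d = Dict{String,Any}("rate" => 0.5, "steps" => 1_000_000)
s = Params(0.5, 1_000_000)

# Call each function once first so compilation time is not measured.
accumulate_dict(d)
accumulate_struct(s)

println("Dict:   ", @elapsed(accumulate_dict(d)), " s")
println("struct: ", @elapsed(accumulate_struct(s)), " s")
```

Running `@code_warntype accumulate_dict(d)` in the REPL is one way to see where the compiler falls back to `Any`; how much speedup the struct version gives in a real program will depend on how much of the work sits in such loops.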