Hi, I'm a new Julia user, and I decided to write a slightly optimized QR decomposition for tridiagonal matrices (using Householder reflections).
My code (and its output) is here: https://gist.github.com/lightcatcher/8118181 . I've confirmed that both implementations are correct (they match each other and the output from qr and np.linalg.qr), but the Julia code consistently takes twice as long as the Python code.

From profiling the Julia code, I know that a very large part of the runtime (> 90%) is spent doing the matrix multiplications that update the matrices R and Qt. I checked that my numpy installation is using the LAPACK and BLAS in /usr/lib (i.e. not MKL), but I don't know how to check which library Julia is using for matrix operations (I assume it's the same one).

Further interesting tidbits:

In Julia, using the built-in 'qr' takes ~0.13s. In Python, using np.linalg.qr takes ~0.25s, so my Python tridiagonal QR decomposition is faster than np.linalg.qr (it takes ~0.2s).

Also, it's (marginally but consistently) faster in Julia to just call "copy(T)" rather than "full(Tridiagonal(map(i -> diag(T, i), -1:1)...))", which seems odd to me because "copy" should be O(N^2) and the other method (only copying the sub, main, and super diagonals) should be O(N), where N is the dimension of the square matrix.

So, can anyone shed some light on why my Julia code is 2x as slow as the very similar Python code?

Also, a tangential question: does Julia have an equivalent to "if __name__ == '__main__'"? (In other words, how can I automatically run main when the file is executed as a script?)

Thanks for the help,
Eric
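
P.S. For anyone who would rather not open the gist, here is a minimal pure-Python sketch of the general idea. It is not the code from the gist and not what was timed above: it uses plain lists instead of numpy, and the function name tridiag_qr is just a placeholder. It assumes T is a square tridiagonal matrix, so at step k only rows k and k+1 need to be reflected, and only columns k..k+2 of R can change.

```python
import math

def tridiag_qr(T):
    """Householder QR of a square tridiagonal matrix (list of lists).

    Returns (Q, R) with Q orthogonal, R upper triangular, and Q*R == T
    up to rounding. Illustrative sketch only, not tuned for speed.
    """
    n = len(T)
    R = [[float(x) for x in row] for row in T]
    # Qt accumulates the product of the reflections, so Qt * T == R.
    Qt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for k in range(n - 1):
        a, b = R[k][k], R[k + 1][k]
        if b == 0.0:
            continue  # nothing below the diagonal to eliminate in column k
        norm = math.hypot(a, b)
        # Householder vector v = x + sign(a) * |x| * e1 for x = (a, b).
        v0 = a + math.copysign(norm, a)
        v1 = b
        vv = v0 * v0 + v1 * v1

        # Apply H = I - 2 v v^T / (v^T v) to rows k and k+1 of R.
        # For tridiagonal input only columns k..k+2 are nonzero in these rows.
        for j in range(k, min(k + 3, n)):
            s = 2.0 * (v0 * R[k][j] + v1 * R[k + 1][j]) / vv
            R[k][j] -= s * v0
            R[k + 1][j] -= s * v1
        R[k + 1][k] = 0.0  # exact zero instead of rounding noise

        # Apply the same reflection to rows k and k+1 of Qt.
        for j in range(n):
            s = 2.0 * (v0 * Qt[k][j] + v1 * Qt[k + 1][j]) / vv
            Qt[k][j] -= s * v0
            Qt[k + 1][j] -= s * v1

    # Q is the transpose of the accumulated Qt.
    Q = [[Qt[j][i] for j in range(n)] for i in range(n)]
    return Q, R

if __name__ == '__main__':
    Q, R = tridiag_qr([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
    for row in R:
        print(row)
```

In this sketch each reflection touches only two rows, so the update is done with small loops rather than full matrix products. Whether that matters for the timings above is something I have not measured.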
