Hi all,

I've got an algorithm that hinges critically on fast matrix multiplication. I put the function up in this gist:

https://gist.github.com/floswald/6dea493417912536688d#file-tensor-jl-L45

The link points to the line (45) that takes most of the time, as you can see in the profile output that is there as well. I am trying to figure out whether I'm doing something wrong here or whether that line just takes as long as it takes. I have to do this many times, so if it takes too long I have to change my strategy.

The core of the problem looks like this:

    for imat in 2:nbm
        v0 = copy(v1)
        stemp = ibm[ks[imat]]
        n = size(stemp,1)
        m = nall / n
        for i in 1:m
            v1[m*(0:(n-1)) + i] = stemp * v0[(n*(i-1)) + (1:n)]
        end
    end

where v0 and v1 are vectors and stemp is a matrix. I spend a lot of time in the matrix multiplication line in the innermost loop.

Any suggestions would be much appreciated.

Thanks!
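
For reference, here is a minimal sketch of what the inner loop over i computes, written for current Julia. It assumes stemp is square (n by n) and that length(v0) is divisible by n. The function names kron_step_loop and kron_step_blas are hypothetical and do not appear in the gist. The first function mirrors the loop above, with integer division and broadcasting dots added. The second expresses the same result as one matrix-matrix product on a reshaped v0, which may be cheaper than m small matrix-vector products, though that has not been measured here.

```julia
# Loop version, mirroring the inner loop of the post.
# Assumes stemp is n-by-n and length(v0) is a multiple of n.
function kron_step_loop(stemp::AbstractMatrix, v0::AbstractVector)
    n = size(stemp, 1)
    m = div(length(v0), n)            # integer division, so m can be used as an index
    v1 = similar(v0)
    for i in 1:m
        # Multiply the i-th contiguous block of v0 and scatter the result with stride m.
        v1[m .* (0:(n-1)) .+ i] = stemp * v0[n*(i-1) .+ (1:n)]
    end
    return v1
end

# Same result from a single matrix-matrix product.
function kron_step_blas(stemp::AbstractMatrix, v0::AbstractVector)
    n = size(stemp, 1)
    m = div(length(v0), n)
    V0 = reshape(v0, n, m)            # column i is the i-th block of v0
    W = stemp * V0                    # column i is stemp times the i-th block
    return vec(permutedims(W))        # the stride-m scatter is a transpose of W
end

# Small check that both versions agree.
stemp = [1.0 2.0; 3.0 4.0]
v0 = collect(1.0:6.0)
println(kron_step_loop(stemp, v0) == kron_step_blas(stemp, v0))
```

The reshape does not copy v0, so the extra cost over the loop is the transpose at the end.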
