From the profile output it looks like a lot of time is spent in getindex. I suppose that is bad news? I'm not sure how I could avoid any of the index lookups.
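One common reason getindex dominates a profile is that indexing a vector with a range allocates a fresh copy on every call; a no-copy view avoids that allocation (in 2014-era Julia the function was `sub`; in current releases it is `view`). A minimal sketch in current syntax, with made-up toy values:

```julia
v0 = collect(1.0:12.0)
n, i = 3, 2
r = n*(i-1) .+ (1:n)   # the same index pattern as in the loop: here 4:6
a = v0[r]              # getindex with a range: allocates a fresh 3-element vector
b = view(v0, r)        # no copy: reads through to v0's memory
```

Both read the same values, but only the slice `a` pays for an allocation each time through a loop.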
On Tuesday, 29 July 2014, Dahua Lin <[email protected]> wrote:
> You may have to check which is the bottleneck: getindex or matrix
> multiplication.
>
> Dahua
>
> On Tuesday, July 29, 2014 4:22:32 PM UTC-5, Florian Oswald wrote:
>>
>> Hi all,
>>
>> I've got an algorithm that hinges critically on fast matrix
>> multiplication. I put up the function on this gist
>>
>> https://gist.github.com/floswald/6dea493417912536688d#file-tensor-jl-L45
>>
>> indicating the line (45) that takes most of the time, as you can see in
>> the profile output that is there as well. I am trying to figure out
>> whether I'm doing something wrong here or whether that line simply takes
>> as long as it takes. I have to do this many times, so if it is too slow
>> I will have to change my strategy.
>>
>> The core of the problem looks like this:
>>
>> for imat in 2:nbm
>>     v0 = copy(v1)
>>     stemp = ibm[ks[imat]]
>>     n = size(stemp, 1)
>>     m = div(nall, n)  # integer division, so m can be used as a loop bound and index
>>     for i in 1:m
>>         v1[m*(0:(n-1)) + i] = stemp * v0[(n*(i-1)) + (1:n)]
>>     end
>> end
>>
>> where the v's are vectors and stemp is a matrix. I spend a lot of time in
>> the matrix multiplication line of the innermost loop. Any suggestions
>> would be much appreciated. Thanks!
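One way to sidestep both the indexing and the many small multiplications in the quoted loop: column i of stemp * reshape(v0, n, m) is exactly what iteration i writes into row i of reshape(v1, m, n), so the whole inner loop collapses to a single matrix product plus a transpose. A sketch in current Julia syntax (the function name apply_block! is mine, not from the gist), assuming the vector length is a multiple of n:

```julia
using LinearAlgebra

# Equivalent to the quoted inner loop: for each i, column i of
# stemp * reshape(v0, n, m) lands in row i of reshape(v1, m, n),
# so one BLAS-backed product replaces m small ones.
function apply_block!(v1::Vector{Float64}, v0::Vector{Float64}, stemp::Matrix{Float64})
    n = size(stemp, 1)
    m = div(length(v0), n)       # assumes length(v0) is a multiple of n
    V0 = reshape(v0, n, m)       # reshape shares memory; no copy is made
    V1 = reshape(v1, m, n)
    transpose!(V1, stemp * V0)   # write the n-by-m product transposed into v1
    return v1
end
```

Since the reshapes are views of the original vectors, the only allocation left is the product itself; a preallocated buffer and `mul!` would remove that too.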
