Tim, you're absolutely right- it turns out that both arrays were Real, and 
it speeds right up when I convert them to Float64.

Thank you very much.

On Wednesday, January 21, 2015 at 7:41:15 AM UTC-5, Tim Holy wrote:
>
> A few more points: 
> - Simon's suggestion of checking the types is spot on; if one is a matrix 
> of 
> Any, for example, you're doomed to a slower code path. If the types are 
> Array{Float64,2} and Array{Float64,1}, that's not the problem. 
> - It should be slightly faster if you change your parentheses, 
> shrt*(diagm(expr)*shrt'). Jutho's scale suggestion would be even better. 
> - Try the profiler (see the docs). If you're running julia 0.4 and see 
> it's 
> spending a lot of time in generic_matmatmul, do a git pull and 
> rebuild---it 
> should fix the problem. 
>
> --Tim 
>
> On Wednesday, January 21, 2015 01:05:06 AM Simon Byrne wrote: 
> > As Jutho said, this shouldn't happen, but is difficult to diagnose 
> without 
> > further information. What are the types of shrt and expr? (these can be 
> > found using the typeof function). 
> > 
> > Simon 
> > 
> > On Wednesday, 21 January 2015 07:59:34 UTC, Jutho wrote: 
> > > Not sure what is causing the slowness, but you could avoid creating a 
> > > diagonal matrix and then doing the matrix multiplication with 
> diagm(expr) 
> > > which will be treated as a full matrix. 
> > > Instead of shrt*diagm(expr) which is interpreted as the multiplication 
> of 
> > > two full matrices, try scale(shrt,expr) . 
> > > 
> > > Op woensdag 21 januari 2015 07:56:19 UTC+1 schreef Micah McClimans: 
> > >> I'm running into trouble with a line of matrix multiplication going 
> very 
> > >> slowly in one of my programs. The line I'm looking into is: 
> > >> objectivematrix=shrt*diagm(expr)*(shrt') 
> > >> where shrt is 12,000x600 and expr is 600 long. This line takes 
> several 
> > >> HOURS to run, on a computer that can run 
> > >> 
> > >> k=rand(12000,12000) 
> > >> k3=k*k*k 
> > >> 
> > >> in under a minute. I've tried devectorizing the line into the 
> following 
> > >> loop (shrt is block-diagonal with each block ONevecs and -ONevecs 
> > >> respectively, so I split the loop in half) 
> > >> 
> > >> objectivematrix=zeros(2*size(ONevecs,1),2*size(ONevecs,1)) 
> > >> for i in 1:size(ONevecs,1) 
> > >> 
> > >>     print(i) 
> > >>     for j in 1:size(ONevecs,1) 
> > >>     
> > >>         for k in 1:size(ONevecs,2) 
> > >>         objectivematrix[i,j]+=ONevecs[i,k]*ONevecs[j,k]*expr[k] 
> > >>         end 
> > >>     
> > >>     end 
> > >> 
> > >> end 
> > >> for i in 1:size(ONevecs,1) 
> > >> 
> > >>     print(i) 
> > >>     for j in 1:size(ONevecs,1) 
> > >>     
> > >>         for k in 1:size(ONevecs,2) 
> > >> 
> > >> 
> objectivematrix[i+size(ONevecs,1),j+size(ONevecs,1)]+=ONevecs[i,k]*ONevec 
> > >> s[j,k]*expr[k+size(ONevecs,2)]>> 
> > >>         end 
> > >>     
> > >>     end 
> > >> 
> > >> end 
> > >> 
> > >> and this give a print out every couple seconds- it's faster than the 
> > >> matrix multiplication version, but not enough. Why is this taking so 
> > >> long? 
> > >> This should not be a hard operation. 
>
>

Reply via email to