This is my first look at Julia. I read somewhere that de-vectorizing code is
advised, so I tried this:

function matmul(a,b)
    # result is size(a,1) x size(b,2), with the same element type as a
    c=zeros(eltype(a),(size(a,1),size(b,2)))
    for j = 1:size(b,2)
        for i =1:size(a,1)
            for k = 1:size(b,1)
                c[i,j]+=a[i,k]*b[k,j]
            end
        end
    end
    c
end


function matmul2(a,b)
    a*b
end


a=rand(2,3);
b=rand(3,4);
c=matmul(a,b);   #just to make the JIT 
c1=matmul2(a,b); #compile the functions ahead of @time
a=rand(6000,500);
b=rand(500,8000);
@time matmul(a,b);
@time matmul2(a,b);



and I got this:

elapsed time: 150.661463517 seconds (384000192 bytes allocated)
elapsed time: 0.990317124 seconds (384000192 bytes allocated)
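
To put those two timings in perspective, here is a rough back-of-the-envelope
calculation. It assumes the usual count of one multiply and one add per
innermost iteration, so 2*m*k*n floating-point operations in total; the names
m, k, n, flops, t_loop and t_blas are just mine for this sketch.

```julia
# Sizes from the benchmark above: a is m x k, b is k x n.
m, k, n = 6000, 500, 8000

# Assumed cost model: one multiply and one add per innermost iteration.
flops = 2 * m * k * n

t_loop = 150.661463517   # seconds measured for matmul
t_blas = 0.990317124     # seconds measured for matmul2

println("loop:  ", flops / t_loop / 1e9, " GFLOP/s")
println("a*b:   ", flops / t_blas / 1e9, " GFLOP/s")
println("ratio: ", t_loop / t_blas)
```

If I did the arithmetic right, that is roughly 0.3 GFLOP/s for the loop
against roughly 48 GFLOP/s for a*b, a ratio of about 152.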


I assume the built-in matrix multiplication calls some kind of BLAS, maybe
written in Fortran (or assembler?), maybe optimized for SSE2, and surely using
all 4 of my cores, so this is not the typical example where de-vectorizing is
advisable...
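
One way to check the guess about the cores, as a sketch only: it assumes a
recent Julia where the LinearAlgebra standard library provides
BLAS.get_num_threads and BLAS.set_num_threads (older versions spell these
differently). The matrices here are deliberately smaller than in the benchmark
so it runs quickly.

```julia
using LinearAlgebra

# How many threads the BLAS backend is currently using.
nthreads = BLAS.get_num_threads()
println("BLAS threads: ", nthreads)

a = rand(600, 50)
b = rand(50, 800)
a * b                            # warm-up so compilation is not timed

BLAS.set_num_threads(1)          # force single-threaded BLAS
@time a * b

BLAS.set_num_threads(nthreads)   # restore the previous setting
@time a * b
```

Comparing the two timings on the full-size matrices should show how much of
the gap comes from multithreading alone.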


Nonetheless, isn't a factor of 150 a bit higher than expected? Did I miss
something important in the matmul code?
