Thanks for the quick response,

If I understand correctly, this is similar to the first plateau in 
http://www.stanford.edu/~jacobm/matrixmultiply.html for values in the 
range 50-200, at a factor of roughly 1.3 times the BLAS speed. I 
completely overlooked this before.

So to make a fair comparison to that C implementation, I have to compare 
the Julia speed (10-15 times BLAS speed) with the C speed (1.3 times BLAS 
speed) in the first regime, and the Julia speed (100 times BLAS speed) with 
the C speed (4 to 5 times BLAS speed) in the second regime. Any idea 
where the big difference between Julia and C is coming from?

Best regards,

Jutho
