The following figure is part of the results of profiling a 
performance-critical piece of code:

<https://lh6.googleusercontent.com/-3dfRngVQZGo/VK3BTDrwPaI/AAAAAAAABuc/2zMSWI12e9w/s1600/profile.png>
 * The pink section corresponds to a single line:
                   G = - G * C
where G and C are moderate-size N x N matrices; this line is called O(N) times.
 * the brown block is `gemm_wrapper!`
 * the green block is the `-` operation in array.jl
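
For concreteness, here is a minimal sketch of that line in isolation. It is not the profiled code: `N = 200` is an assumed size (the real N is only "moderate"), the names `A` and `B` are illustrative, and `mul!` is the Julia 1.x spelling of the in-place multiply. It shows that `-G * C` parses as `(-G) * C`, so the `-` runs as its own array operation before `gemm`, and it compares that with a variant that negates in place.

```julia
using LinearAlgebra

N = 200                      # assumed size; the post only says "moderate"
G = rand(N, N)
C = rand(N, N)

# Unary minus binds tighter than `*`, so `-G * C` is `(-G) * C`:
# first an N x N temporary holding `-G`, then a gemm into a second new array.
ex = :(-G * C)
parsed_as_negate_then_multiply = ex.args[1] == :* && ex.args[2] == :(-G)

A = -G * C                   # same form as the profiled line (result kept in A)

# Illustrative alternative: multiply into preallocated storage, negate in place.
B = similar(G)
mul!(B, G, C)                # B = G * C without allocating the result
B .*= -1                     # in-place negation, no temporary array
```

Timing the two variants separately (for example with `@time` after a warm-up call) would show how much of the pink block is allocation rather than arithmetic.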

QUESTIONS:
 - Is the `-` operation really more expensive than the matrix 
multiplication? Why? 
 - What is happening in the part of the pink block that has no block above it?
 - Closely related: from what matrix size onwards are matrix-vector 
multiplications best performed with BLAS rather than with for-loops?

Many thanks,
    Christoph




