I'm not an experienced programmer, but I was wondering if someone would be 
interested in commenting on a performance difference that surprised me a 
little.  It's something I ran across while writing some simple simulations, 
and I have boiled it down to the following minimal example.  If I define the 
function below in a file and include it, execution takes about 0.9 s:

function test()
    a = [.5 2.]   # the parameter array; this literal is what matters
    N = 10000000
    M = length(a)

    sum = 0       # accumulator (shadows Base.sum; starts out as an Int)
    tic()
    for i=1:N
        for j=1:M
            sum = sum+a[j]
        end
    end
    toc()
end

julia> test()
elapsed time: 0.885158777 seconds

If, however, I change the second line to a = [.5 2] (no period after the 
second number, so that one number is clearly real while the other is an 
integer), it takes 40-50% longer:

julia> test()
elapsed time: 1.27722677 seconds
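
For what it's worth, at run time both literals seem to end up as the same 
concrete array type, which would suggest the slowdown happens at compile 
time, in type inference, rather than in the array itself.  A quick check 
(this is just a sketch of what I mean, run at the REPL):

```julia
# Both literals promote their elements to Float64 when the array is
# built, so the resulting runtime types appear identical; the timing
# difference is presumably a compile-time inference issue.
a1 = [.5 2.]
a2 = [.5 2]
println(typeof(a1))
println(typeof(a2))
println(typeof(a1) == typeof(a2))   # true
```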

Defining a as const did not change this result.  But if, on the other hand, 
I pass the array a into test as an argument, there is no difference 
between test([.5 2.]) and test([.5 2]).  And of course typing

a = zeros(2)
a[1] = .5
a[2] = 2

does not incur the performance penalty, but it is inconvenient for larger 
arrays.  I ran into this while writing a simulation in which I defined some 
parameters as arrays at the beginning of the simulation.  In that case the 
difference between the two variants (all values clearly reals, as opposed 
to a mix of reals and integers) changed the execution time by a factor 
of 15-20.
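
One middle ground that might help (assuming the typed-literal syntax is 
available in your Julia version): prefixing the literal with the element 
type converts every element up front, so mixed integer and float literals 
become harmless without the zeros-and-assign dance:

```julia
# A typed array literal: the Float64 prefix converts every element,
# so it does not matter that 2 is written as an integer here.
a = Float64[.5 2]      # same values as [.5 2.], but the eltype is explicit
println(typeof(a))
println(a)
```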

This was somewhat surprising, as I had figured that the compiler would be 
able to infer that a is an array of floats in either case.  As a result, 
I've taken to either passing any parameters into the function as arguments, 
or very carefully reading my code to make sure I'm not mixing integers and 
floats in these kinds of definitions.  I don't know whether this 
performance difference is avoidable, but as a naive user it did surprise me.
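
The passing-parameters-in workaround can be sketched like this (the names, 
the smaller N, and the 0.0 initializer are mine; I also start the 
accumulator as a float so its type stays stable inside the loop):

```julia
# Sketch of the workaround: the parameter array comes in as an argument,
# so the compiler sees its concrete type at the call site.  Starting the
# accumulator at 0.0 (rather than 0) keeps its type stable in the loop.
function test2(a, N)
    s = 0.0
    for i = 1:N
        for j = 1:length(a)
            s = s + a[j]
        end
    end
    return s
end

println(test2([.5 2.], 1000))   # 2500.0
println(test2([.5 2], 1000))    # 2500.0 -- identical for both literals
```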
