I'm not an experienced programmer, but I was wondering if someone is
interested in commenting on this performance difference that surprised me a
little. It's something I ran across while writing some simple simulations,
and I have made the following minimal example. If I define the following
function in a file and include it, execution takes about 0.9 s:
function test()
    a = [.5 2.]
    N = 10000000
    M = length(a)
    sum = 0
    tic()
    for i = 1:N
        for j = 1:M
            sum = sum + a[j]
        end
    end
    toc()
end
julia> test()
elapsed time: 0.885158777 seconds
If, however, I change the line defining a to a = [.5 2] (no period after
the second number, so that one value is clearly a real while the other is
an integer), it takes 40-50% longer:
julia> test()
elapsed time: 1.27722677 seconds
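One way to see what the two literals actually construct is to inspect their
types; a quick check along these lines (the results may depend on the Julia
version — on recent versions both literals promote to Float64):

```julia
# Inspect what each literal constructs: typeof shows the array type,
# eltype the element type. (On recent Julia versions both literals
# promote to Float64; older versions may have behaved differently.)
a1 = [.5 2.]   # every entry written as a float
a2 = [.5 2]    # mixed float and integer literals
println(typeof(a1))
println(typeof(a2))
println(eltype(a1) == eltype(a2))
```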
Defining a as const did not change this result. But if, on the other hand,
I pass the array a as an argument to test, there is no difference
between test([.5 2.]) and test([.5 2]). And of course typing
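Another option that may sidestep the issue without writing every entry as a
float is a typed array literal, which forces the element type up front (the
values below are just placeholders):

```julia
# A typed array literal fixes the element type regardless of how the
# individual entries are written: mixed integer/float literals are all
# converted to Float64. (The parameter values are placeholders.)
a = Float64[.5, 2]
println(eltype(a))   # Float64
```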
a = zeros(2)
a[1] = .5
a[2] = 2
does not have the performance penalty, but is inconvenient for larger
arrays. I ran into this while writing a simulation in which I defined some
parameters as arrays at the beginning of the simulation. In that case the
difference between the two cases (all values clearly reals as opposed to a
mix of reals and integers) changed the execution time by a factor
of 15-20.
This was somewhat surprising, as I had figured that the compiler would be
able to infer that a is an array of floats in either case. As a result,
I've taken to either passing any parameters into the function or reading
my code very carefully to make sure I'm not mixing integers and floats
in these kinds of definitions. I don't know whether this performance
difference is avoidable, but as a naive user it did surprise me.
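A minimal sketch of the argument-passing workaround (the name test2 is
illustrative; it also starts the accumulator at 0.0, since initializing it
as the integer 0 makes the variable change type on the first addition,
which can itself cost performance):

```julia
# Sketch of the workaround: the parameter array is passed in as an
# argument, and the accumulator starts at 0.0 so it stays a Float64
# throughout the loops. (test2 is an illustrative name.)
function test2(a)
    N = 10000000
    s = 0.0                  # Float64 from the start
    for i in 1:N
        for j in 1:length(a)
            s += a[j]
        end
    end
    return s
end

test2([.5 2.])   # test2([.5 2]) gives the same result
```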