Today, while trying to optimize a piece of code, I came across a rather curious
memory allocation behavior when accessing a DataArray.
x = rand(1:10, 1000000);
# Count the elements greater than 5
function countGT(x::Array{Int,1})
    count = 0
    for i = 1:length(x)
        count += (x[i] > 5) ? 1 : 0
    end
    count
end
Here is what you get after running @time (compilation excluded):
@time countGT(x);
elapsed time: 0.00847156 seconds (96 bytes allocated)
That is not too bad. As far as I understand, @time itself allocates at least
80 bytes, and the extra 16 bytes are presumably for creating the variable
"count", so far so good.
Now let's see what happens if we do the same with a floating-point array.
x = rand(1000000);
function countGT(x::Array{Float64,1})
    count = 0.0
    for i = 1:length(x)
        count += (x[i] > 5.0) ? 1.0 : 0.0
    end
    count
end
countGT(x)  # warm-up call so that compilation is not timed
@time countGT(x)
You get
elapsed time: 0.00177126 seconds (96 bytes allocated)
Which is still pretty good. Now, the problem starts to show up when I have a
DataArray:
using DataArrays
x = @data rand(1000000);
function countGT(x::DataArray{Float64,1})
    count = 0.0
    for i = 1:length(x)
        count += (x[i] > 5.0) ? 1.0 : 0.0
    end
    count
end
countGT(x)  # warm-up call so that compilation is not timed
@time countGT(x)
You get
elapsed time: 0.23610454 seconds (16000096 bytes allocated)
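The bytes allocated seem to scale with the size of the DataArray: 16000000
extra bytes for 1000000 elements works out to about 16 bytes per element
access. So it seems that the mere act of accessing an element in a DataArray
allocates memory.
I am wondering whether there could be a better way.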
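One workaround that might be worth trying is to loop over the raw storage
instead of indexing the DataArray itself. This is only a sketch, and I have
not confirmed that it is the recommended approach. It assumes that a DataArray
keeps its values in a plain array and its missing-value flags in a bit mask
(the `data` and `na` fields, as far as I can tell). The function name
`countGT_masked` and its argument names are hypothetical, and plain arrays
stand in for the two fields so the example runs without DataArrays.

```julia
# Hypothetical helper: count the non-missing values greater than 5.0.
# `values` stands in for the assumed `data` field of a DataArray,
# `isna` for the assumed `na` bit mask (true means the entry is missing).
function countGT_masked(values::Vector{Float64}, isna::BitVector)
    count = 0.0
    for i = 1:length(values)
        # Both lookups return concrete types (Bool and Float64),
        # so nothing here should need a heap allocation per element.
        if !isna[i] && values[i] > 5.0
            count += 1.0
        end
    end
    count
end

# Stand-in data: 1000000 random values, none of them missing
vals = rand(1000000);
mask = falses(1000000);
countGT_masked(vals, mask)  # warm-up call so that compilation is not timed
@time countGT_masked(vals, mask)
```

With an actual DataArray the call would presumably be
`countGT_masked(x.data, x.na)`, again assuming those field names. Note that
this version skips missing entries, which is a choice the original loop does
not make.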