I have two versions of an example function that computes a number by
looping over all pairs of points. In the first one I use a 2D array and
access points with the [:,i] syntax to get the coordinates. In the second
version I instead create an array of Point types (each Point has an x and
a y coordinate) and then access the coordinates as point.x, point.y, etc.
These two functions differ vastly in run time and memory usage.
This is the first function:
function slow()
    srand(1234)
    points = randn(2, 5000)
    n_points::Int = size(points,2)
    cum = 0.0
    for i in 1:n_points
        for j in (i+1):n_points
            point_2 = points[:, j]
            cum += point_2[1]
        end
    end
    return cum
end
This is the fast version with the Point types:
immutable Point
    x::Float64
    y::Float64
end

function fast()
    srand(1234)
    points = randn(2, 5000)
    n_points = size(points, 2)
    cum = 0.0
    # Create array of points
    points_vec = Point[]
    for i in 1:n_points
        push!(points_vec, Point(points[1,i], points[2,i]))
    end
    for i in 1:n_points
        for j in (i+1):n_points
            point_2 = points_vec[j]
            cum += point_2.x
        end
    end
    return cum
end
Running
@time println(slow())
@time println(fast())
now gives:
-23952.535945302105
elapsed time: 0.954317047 seconds (1055 MB allocated, 3.78% gc time in 48 pauses with 0 full sweep)
-23952.535945302105
elapsed time: 0.025171914 seconds (1 MB allocated)
The slow version takes about 38 times longer (0.954 s vs. 0.025 s) and
allocates roughly 1000 times as much memory.
Running the functions with the memory allocation tracker gives:
         -
         -
         -
         - function slow()
     28688     srand(1234)
     80048     points = randn(2, 5000)
         0     n_points::Int = size(points,2)
         0     cum = 0.0
         0     for i in 1:n_points
         0         for j in (i+1):n_points
1099780000             point_2 = points[:, j]
         0             cum += point_2[1]
         -         end
         -     end
         0     return cum
         - end
         -
         -
         -
         - immutable Point
         -     x::Float64
         -     y::Float64
         - end
         -
         - function fast()
   2540964     srand(1234)
     80048     points = randn(2, 5000)
         0     n_points = size(points, 2)
         0     cum= 0.0
         -
         -     # Create array of points
        48     points_vec = Point[]
         0     for i in 1:n_points
    263112         push!(points_vec, Point( points [1,i], points [2,i]))
         -     end
         -
         0     for i in 1:n_points
         0         for j in (i+1):n_points
         0             point_2 = points_vec[j]
         0             cum += point_2.x
         -         end
         -     end
         0     return cum
         - end
         -
         -
         -
         - @time println(slow())
         - @time println(fast())
         -
         -
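As a rough sanity check on the listing above (my own arithmetic, not tracker output), the 1099780000 bytes reported for the slicing line can be divided by the number of inner-loop iterations. With 5000 points there are 5000*4999/2 pairs with i < j, and the division comes out to a whole number of bytes per slice. The variable names below are only for illustration.

```julia
# Bytes allocated per execution of `point_2 = points[:, j]`,
# derived from the figures in the allocation listing.
n_points = 5000
n_inner = div(n_points * (n_points - 1), 2)   # pairs (i, j) with i < j
bytes_total = 1099780000                      # reported for the slicing line
bytes_per_slice = div(bytes_total, n_inner)
println(n_inner)           # 12497500
println(bytes_per_slice)   # 88
```

A 2-element Float64 vector holds only 16 bytes of data, so an even 88 bytes per iteration would be consistent with one small array being allocated on every pass through the inner loop. That the remainder is per-array overhead is my assumption, not something the tracker reports.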
So what seems to take all the memory is
point_2 = points[:, j]
Maybe some copying is performed when slicing, but I have tried replacing it
with sub, slice, etc. (which shouldn't copy?) and it just gets worse. Are
there some alignment issues?
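The copying hypothesis can be checked with a short sketch. It only uses plain indexing, so it should not depend on the Julia version. The first part modifies a slice and looks at whether the original array changes. The second part is a variant of the loop that reads the single element directly instead of slicing first. sum_scalar_index is a hypothetical name for illustration, not one of the functions above, and I have not timed it here.

```julia
# Variant of the slow() loop that reads one Float64 per iteration
# instead of building a column slice first.
function sum_scalar_index(points)
    n_points = size(points, 2)
    cum = 0.0
    for i in 1:n_points
        for j in (i+1):n_points
            cum += points[1, j]   # scalar indexing, no temporary array
        end
    end
    return cum
end

# Check whether `points[:, j]` returns an independent copy.
small = zeros(2, 3)
col = small[:, 2]        # same slicing expression as in slow()
col[1] = 1.0             # modify the slice only
println(small[1, 2])     # still 0.0 if the slice is a copy

# Tiny input where the result is easy to verify by hand:
# pairs (1,2), (1,3), (2,3) contribute 2.0 + 3.0 + 3.0.
println(sum_scalar_index([1.0 2.0 3.0; 0.0 0.0 0.0]))
```

If the original array is unchanged after writing to the slice, then each points[:, j] creates a fresh array, which would account for the allocation on that line.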
I have tried both in 0.3.5 and 0.4 with the same results.
Any help?
Best regards,
Kristoffer Carlsson