I have been trying to use SubArrays to reduce the number of matrix copies 
that my code is doing, but I've been surprised to find that replacing 
slices and extra copies with SubArrays seems to slow things down.

My first question is what is the difference between the "+" operator and 
the (AFAIK undocumented) ".+" operator for SubArrays?  .+ seems to be about 
3x faster, although the output seems to be the same.  (img is a 2048x3072 
Array{Float32,2}).

julia> @time result_no_dot = sub(img,:,1:3072-2) + sub(img,:,3:3072)
elapsed time: 0.182192137 seconds (25151376 bytes allocated)

julia> @time result_with_dot = sub(img,:,1:3072-2) .+ sub(img,:,3:3072)
elapsed time: 0.060138815 seconds (25152360 bytes allocated)

julia> result_no_dot == result_with_dot
true

I also don't understand what the profiler is telling me.  With the "+" 
operator I get:

9  multidimensional.jl; getindex; line: 119
1  multidimensional.jl; getindex; line: 120
83 multidimensional.jl; getindex; line: 123
15 multidimensional.jl; getindex; line: 124
2  multidimensional.jl; getindex; line: 125
26 multidimensional.jl; getindex; line: 127
              30 profile.jl; anonymous; line: 14
                21 array.jl; +; line: 756
                7  multidimensional.jl; getindex; line: 119

While with the .+ operator I get the much clearer (to me):

3  abstractarray.jl; size; line: 9
              53 profile.jl; anonymous; line: 14
                53 broadcast.jl; .+; line: 148
                  12 broadcast.jl; broadcast!; line: 92
                  41 broadcast.jl; broadcast!; line: 101
                      3  abstractarray.jl; size; line: 9
                      19 broadcast.jl; _F_; line: 71
                      9  broadcast.jl; _F_; line: 72
                      1  broadcast.jl; _F_; line: 256
                      9  broadcast.jl; _F_; line: 257



My second question: is there a way to avoid doing any allocation at all 
(short of de-vectorizing)?  Even when I preallocate an array for the 
result, the "+" and ".+" operators (or maybe the assignment operator) seem 
to create an extra copy of the array:

julia> my_result = Array(Float32, 2048, 3070)
julia> @time my_result = sub(img,:,1:3072-2) .+ sub(img,:,3:3072)
elapsed time: 0.059179547 seconds (25152360 bytes allocated)

I can't seem to get that extra 25 MB of allocation to go away.
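
For reference, here is a minimal sketch of what the assignment in that snippet does, using small stand-in arrays rather than img. The names a, b, buf and out are placeholders, not part of the session above. In Julia, `x = expr` rebinds the name x to whatever expr returns; it does not copy into the array x previously referred to. The sketch contrasts that with broadcast!, the function that shows up in the ".+" profile, which takes the destination array as an argument.

```julia
# Stand-in operands (hypothetical; the real case uses sub(img, ...)).
a = ones(Float32, 4, 3)
b = ones(Float32, 4, 3)

buf = zeros(Float32, 4, 3)     # preallocated result buffer
out = buf                      # `out` and `buf` name the same array

out = a .+ b                   # builds a new array, then rebinds `out` to it
rebound   = !(out === buf)     # `out` no longer refers to the buffer
untouched = all(buf .== 0)     # the buffer was never written to

broadcast!(+, buf, a, b)       # stores a + b into the existing buffer
filled = all(buf .== 2)        # the buffer now holds the sums
```

Under that reading, the call for the case above would presumably be broadcast!(+, my_result, sub(img,:,1:3072-2), sub(img,:,3:3072)). That is an untested assumption on this data, and it may still allocate a small amount for the SubArray objects themselves.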


My third question is what slicing is doing that makes it so much faster 
than sub, even though it has to make an extra copy of each operand?

julia> @time result_with_slicing = img[:,1:3072-2] + img[:,3:3072]
elapsed time: 0.029242747 seconds (75448992 bytes allocated)

That's another 2x faster than SubArrays with the ".+" operator, but it 
allocates 75 MB instead of 25 MB.

And the profile is:
              20 profile.jl; anonymous; line: 14
                2  array.jl; +; line: 754
                7  array.jl; +; line: 756
                11 multidimensional.jl; getindex; line: 47
                 11 multidimensional.jl; _getindex!; line: 33

Thanks,
-Matt
