SubArrays are immutable on 0.4, but tuples aren't inlined yet, so creating
one still forces an allocation.
Assuming you're using 0.3, there's a second problem: the code in the
constructor is not type-stable, and that makes construction slow and memory-
hungry. Compare the following on 0.3 and 0.4:
julia> A = rand(2,10^4);
julia> function myfun(A)
           s = 0.0
           for j = 1:size(A,2)
               S = slice(A, :, j)
               s += sum(S)
           end
           s
       end
myfun (generic function with 1 method)
On 0.3:
# warmup call
julia> @time myfun(A)
elapsed time: 0.145141435 seconds (11277536 bytes allocated)
# the real call
julia> @time myfun(A)
elapsed time: 0.034556106 seconds (7866896 bytes allocated)
On 0.4:
julia> @time myfun(A)
elapsed time: 0.190744146 seconds (7 MB allocated)
julia> @time myfun(A)
elapsed time: 0.000697173 seconds (1 MB allocated)
Comparing the second (post-warmup) calls, you can see it's about 50x faster
and allocates about 8-fold less memory on 0.4. Once Jeff finishes his tuple
overhaul, the allocation on 0.4 could potentially drop to 0.
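If you want a per-view number rather than a total, here is a rough sketch
that divides the bytes allocated by the number of views created. The names
`colview` and `sumcols` are made up for this example, and it assumes
`@allocated` is available in your version. It also picks `view` when Base
defines it, since newer releases may use that name instead of `slice`.

```julia
# Hypothetical helper names; not part of the benchmark above.
# Assumption: `slice` is the 0.3/0.4 name, `view` the name on newer releases.
const colview = isdefined(Base, :view) ? Base.view : Base.slice

# Same loop as myfun, but going through the version-neutral `colview`.
function sumcols(A)
    s = 0.0
    for j = 1:size(A, 2)
        s += sum(colview(A, :, j))
    end
    s
end

A = rand(2, 10^4)
sumcols(A)                      # warmup call, so compilation is not measured
bytes = @allocated sumcols(A)   # bytes allocated by the second call
println(bytes / size(A, 2), " bytes per view (approximate)")
```

The figure is only an estimate: it also counts any small allocations made
outside the loop, so treat it as an upper bound on the cost of each view.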
--Tim
On Wednesday, March 25, 2015 11:18:08 AM Sebastian Good wrote:
> I was surprised by two things in the SubArray implementation
>
> 1) They are big! About 175 bytes for a simple subset from a 1D array from
> my naive measurement.[*]
> 2) They are not flat. That is, they seem to get heap allocated and have
> indirections in them.
>
> I'm guessing this is because SubArrays aren't immutable, and tuples aren't
> always inlined into an immutable either, but I am really grasping at straws.
>
> I'm walking through a very large memory mapped structure and generating
> hundreds of thousands of subarrays to look at various windows of it. I was
> hoping that by using views I would reduce memory usage as compared with
> creating copies of those windows. Indeed I am, but by a lot less than I
> thought I would be.
>
> In other words: SubArrays are surprisingly expensive because they
> necessitate several memory allocations apiece.
>
> From the work that's gone into SubArrays I'm guessing that isn't meant to
> be. They are so carefully specialized that I would expect them to behave
> roughly like a (largish) struct in common use.
>
> Is this a misconception? Do I need to take more care about how I
> parameterize the container I put them in to take advantage?
>
> [*]
>
> > const b = [1:5;]
> > function f()
>       for i in 1:1_000_000 sub(b, 1:2) end
>   end
> > @time f()
> elapsed time: 0.071933306 seconds (175 MB allocated, 9.21% gc time in 8 pauses with 0 full sweep)