Making sure that precompilation and gc don’t factor into the result, I get
quite different timings:
julia> gc(); @time sizehint!(a, 10_000_000);
0.001493 seconds (46 allocations: 76.296 MB, 304.61% gc time)
julia> gc(); @time b = zeros(Int, 10_000_000);
0.021997 seconds (38 allocations: 76.296 MB, 0.70% gc time)
julia> versioninfo()
Julia Version 0.4.1
Commit cbe1bee* (2015-11-08 10:33 UTC)
Platform Info:
System: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.3
The gc time reported for sizehint!() is obviously bogus, but the difference in
timing is much smaller than in your measurements. I don’t doubt, however, that
there is performance to be gained from the fact that sizehint! doesn’t have to
write zeros to all the memory it allocates. My point was to illustrate
*when* allocation occurs, and how *usage* of the array differs depending on
the approach, rather than the minute implementation details.
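
For reference, here is a minimal sketch of combining the two approaches: the
dynamic function from earlier in the thread with a sizehint! call added. The
name `hinted` is made up for illustration, and no timings are shown since they
vary with machine and Julia version:

```julia
# Hypothetical variant of `dynamic` (the name `hinted` is an assumption):
# reserve capacity up front, then grow the array with push! as before.
function hinted(n)
    a = Int[]
    sizehint!(a, n)   # reserve room for n elements; length(a) is still 0
    for i = 1:n
        push!(a, i)   # appends within the reserved capacity
    end
    return a
end

hinted(10)            # warm-up call so compilation is not included in the timing
@time hinted(10^7);
```

The array is used exactly as in dynamic (start empty, push! each element); the
only change is the capacity hint up front.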
// T
On Monday, November 16, 2015 at 4:09:54 PM UTC+1, Seth wrote:
> I'm not sure the equivalence is entirely accurate:
>
> julia> a = Vector{Int}()
> 0-element Array{Int64,1}
>
> julia> Base.summarysize(a)
> 0
>
> julia> @time sizehint!(a,10_000_000)
> 0.000096 seconds (149 allocations: 76.304 MB)
> 0-element Array{Int64,1}
>
> julia> Base.summarysize(a)
> 0
>
> julia> @time b = zeros(Int,10_000_000);
> 0.037202 seconds (35 allocations: 76.295 MB, 64.13% gc time)
>
> julia> Base.summarysize(b)
> 80000000
>
>
>
> On Monday, November 16, 2015 at 1:10:36 AM UTC-8, Tomas Lycken wrote:
>>
>> sizehint! preallocates for you, with comparable cost to calling e.g.
>> zeros, but lets you treat the array semantically the same way as a
>> non-preallocated one (but without the cost for reallocation). Hopefully,
>> these comments highlight the differences between the various approaches:
>>
>> N = 10_000
>> A = Array(Float64,0)
>> sizehint!(A, 10_000) # this preallocates memory for 10k elements
>> B = Array(Float64,0)
>> C = zeros(10_000) # this also preallocates memory for 10k elements
>>
>> # now, A and C are pre-allocated, while B is not
>> # however, A and B are semantically equivalent (0-length) vectors,
>> # while C is already of length 10 000:
>> println(length(A)) # 0
>> println(length(B)) # 0
>> println(length(C)) # 10000
>>
>> for i in 1:10_000
>>     push!(A, i) # no reallocation happens here, because we did it with sizehint!
>>     push!(B, i) # this will re-allocate B every now and then
>>     C[i] = i # can't use push! here, but must manually track index instead
>> end
>>
>> I don't know what `dynamic` does in this context, and I can't find it in
>> the docs, so can't help you there :)
>>
>> // T
>>
>> On Monday, November 16, 2015 at 2:07:13 AM UTC+1, Seth wrote:
>>
>>> What happens if you use sizehint!() with dynamic()?
>>>
>>> On Sunday, November 15, 2015 at 3:35:45 PM UTC-8, Steven G. Johnson
>>> wrote:
>>>>
>>>> function prealloc(n)
>>>>     a = Array(Int, n)
>>>>     for i = 1:n
>>>>         a[i] = i
>>>>     end
>>>>     return a
>>>> end
>>>> function dynamic(n)
>>>>     a = Int[]
>>>>     for i = 1:n
>>>>         push!(a, i)
>>>>     end
>>>>     return a
>>>> end
>>>> @time prealloc(10^7);
>>>> @time dynamic(10^7);
>>>>
>>>>
>>>> On my machine, the preallocated version is 2.5–3x faster. A
>>>> significant but not overwhelming margin.
>>>>
>>>
>>
>