[julia-users] Re: push! performance

Tomas Lycken Mon, 16 Nov 2015 08:25:24 -0800


Making sure that precompilation and gc don’t factor into the result, I get 
quite different timings:


julia> gc(); @time sizehint!(a, 10_000_000);
  0.001493 seconds (46 allocations: 76.296 MB, 304.61% gc time)

julia> gc(); @time b = zeros(Int, 10_000_000);
  0.021997 seconds (38 allocations: 76.296 MB, 0.70% gc time)

julia> versioninfo()
Julia Version 0.4.1
Commit cbe1bee* (2015-11-08 10:33 UTC)
Platform Info:
  System: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

The gc percentage reported for sizehint!() is obviously bogus, but the 
difference in timing is much smaller than in Seth’s measurements. I wouldn’t 
doubt, however, that there is performance to be gained from the fact that 
sizehint! doesn’t have to write zeros to all the memory it allocates. My 
point was to illustrate *when* allocation occurs, and how *usage* of the 
array differs depending on approach, rather than the minute implementation 
details.
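
For anyone repeating the measurement on a current Julia, here is a minimal sketch of the same idea: warm up first, collect garbage, then time. It assumes Julia 1.x syntax, where gc() is spelled GC.gc(); the variable names t_hint and t_zeros are only illustrative, and the timings will vary by machine and version.

```julia
# Sketch for Julia 1.x (the transcript above is from Julia 0.4.1).
n = 10_000_000

# Warm-up calls, so that compilation is not part of the timings below.
sizehint!(Int[], 10)
zeros(Int, 10)

a = Int[]
GC.gc()                              # collect garbage before timing
t_hint = @elapsed sizehint!(a, n)    # reserves capacity; length(a) stays 0

GC.gc()
t_zeros = @elapsed zeros(Int, n)     # allocates and writes n zeros

println("sizehint!: ", t_hint, " s, length(a) = ", length(a))
println("zeros:     ", t_zeros, " s")
```

The point of printing length(a) is that the hinted vector is still empty afterwards, so it is used with push!, while the result of zeros is indexed directly.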

// T

On Monday, November 16, 2015 at 4:09:54 PM UTC+1, Seth wrote:

> I'm not sure the equivalence is entirely accurate:
>
> julia> a = Vector{Int}()
> 0-element Array{Int64,1}
>
> julia> Base.summarysize(a)
> 0
>
> julia> @time sizehint!(a,10_000_000)
> 0.000096 seconds (149 allocations: 76.304 MB)
> 0-element Array{Int64,1}
>
> julia> Base.summarysize(a)
> 0
>
> julia> @time b = zeros(Int,10_000_000);
> 0.037202 seconds (35 allocations: 76.295 MB, 64.13% gc time)
>
> julia> Base.summarysize(b)
> 80000000
>
>
>
> On Monday, November 16, 2015 at 1:10:36 AM UTC-8, Tomas Lycken wrote:
>>
>> sizehint! preallocates for you, with comparable cost to calling e.g. 
>> zeros, but lets you treat the array semantically the same way as a 
>> non-preallocated one (but without the cost for reallocation). Hopefully, 
>> these comments highlight the differences between the various approaches:
>>
>> N = 10_000
>> A = Array(Float64,0) 
>> sizehint!(A, 10_000) # this preallocates memory for 10k elements
>> B = Array(Float64,0)
>> C = zeros(10_000) # this also preallocates memory for 10k elements
>>
>> # now, A and C are pre-allocated, while B is not
>> # however, A and B are semantically equivalent (0-length) vectors,
>> # while C is already of length 10 000:
>> println(length(A)) # 0
>> println(length(B)) # 0
>> println(length(C)) # 10000
>>
>> for i in 1:10_000
>>    push!(A, i) # no reallocation happens here, because we did it with sizehint!
>>    push!(B, i) # this will re-allocate B every now and then
>>    C[i] = i # can't use push! here, but must manually track index instead
>> end
>>
>> I don't know what `dynamic` does in this context, and I can't find it in 
>> the docs, so can't help you there :)
>>
>> // T
>>
>> On Monday, November 16, 2015 at 2:07:13 AM UTC+1, Seth wrote:
>>
>>> What happens if you use sizehint!() with dynamic()?
>>>
>>> On Sunday, November 15, 2015 at 3:35:45 PM UTC-8, Steven G. Johnson 
>>> wrote:
>>>>
>>>> function prealloc(n)
>>>>     a = Array(Int, n)
>>>>     for i = 1:n
>>>>         a[i] = i
>>>>     end
>>>>     return a
>>>> end
>>>> function dynamic(n)
>>>>     a = Int[]
>>>>     for i = 1:n
>>>>         push!(a, i)
>>>>     end
>>>>     return a
>>>> end
>>>> @time prealloc(10^7);
>>>> @time dynamic(10^7);
>>>>
>>>>
>>>> On my machine, the preallocated version is 2.5–3x faster.  A 
>>>> significant but not overwhelming margin.
>>>>
>>> 
>>
>
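
On the question quoted above about combining sizehint!() with dynamic(): a minimal sketch of what that could look like, assuming Julia 1.x syntax. The name dynamic_hinted is hypothetical and is not from the thread; prealloc is the quoted function with Array(Int, n) rewritten as Vector{Int}(undef, n). No timings are claimed here, since they depend on the machine and the Julia version.

```julia
# Hypothetical variant of the quoted `dynamic`, with a capacity hint added.
function dynamic_hinted(n)
    a = Int[]
    sizehint!(a, n)      # reserve room for n elements up front
    for i = 1:n
        push!(a, i)      # appends into the reserved capacity
    end
    return a
end

# Julia 1.x spelling of the quoted `prealloc`.
function prealloc(n)
    a = Vector{Int}(undef, n)   # uninitialized storage for n elements
    for i = 1:n
        a[i] = i
    end
    return a
end

# Warm up both functions so that compilation is not timed.
dynamic_hinted(10); prealloc(10)

@time prealloc(10^7);
@time dynamic_hinted(10^7);
```

Both functions return the same vector; the only difference is whether the elements are written by index or appended with push!.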
