On Thu, Jul 14, 2016 at 12:18 PM, Stefan Karpinski <[email protected]> wrote:
> In other words the former only allocates less because the compiler can see
> that you don't do anything with the allocated memory and can skip allocating
> it at all. Returning a value isn't costly but it does force the array to
> actually be allocated so that it can be returned. Presumably in real code
> you would actually do something with the arrays you allocate so there would
> be no difference in performance.

I didn't put it that way, since the issue here is actually that the use
of `a` made the type instability explicit. Without the return, the
compiler was able to see that no use of `a` had an uncertain type.
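
To make the effect of the comma concrete, here is a minimal sketch. The
function names and the counter are made up for illustration and are not
from the original post. It only shows that a comma in the `for` header
turns the following assignment into a second loop specification, so the
loop variable runs over the elements of the array instead of being
assigned the array once per iteration.

```julia
# Hypothetical helpers for illustration only; not part of the original code.

# With the comma, `x = rand(n)` is a second loop specification, so this
# is a nested loop: `x` takes each element of the random vector in turn,
# and the body runs 10 * n times.
function count_with_comma(n)
    iterations = 0
    for k = 1:10, x = rand(n)
        iterations += 1
    end
    return iterations
end

# Without the comma, `x = rand(n)` is an ordinary assignment in the loop
# body, which runs 10 times.
function count_without_comma(n)
    iterations = 0
    for k = 1:10
        x = rand(n)
        iterations += 1
    end
    return iterations
end

count_with_comma(5)     # returns 50
count_without_comma(5)  # returns 10
```

With n = 10000 the first form runs its body 100000 times, which is in
line with the roughly 100000 allocations reported for test2().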

>
> On Thu, Jul 14, 2016 at 10:36 AM, Yichao Yu <[email protected]> wrote:
>>
>> On Thu, Jul 14, 2016 at 9:49 AM, Kevin Squire <[email protected]>
>> wrote:
>> > (To expand on Yichao's comment: Remove the comma in both for loops)
>> >
>> >
>> > On Thursday, July 14, 2016, Yichao Yu <[email protected]> wrote:
>> >>
>> >> On Thu, Jul 14, 2016 at 6:49 AM, Michele Giugliano
>> >> <[email protected]>
>> >> wrote:
>> >> > Julia newbie here!
>> >> >
>> >> > I noticed a performance loss (by means of @benchmark from
>> >> > BenchmarkTools),
>> >> > when a function returns a value versus when it does not.
>> >> >
>> >> > Note: in the code snippet (see below) that I prepared to exemplify my
>> >> > issue,
>> >> > there's also an increase in the number of allocations - which I don't
>> >> > understand - when returning values.
>> >> >
>> >> > However, in my own function (not included here), simulating a
>> >> > mathematical
>> >> > model, there is no such difference in the allocations but a ~5
>> >> > times
>> >> > performance degradation - as indicated by the output of @benchmark
>> >> > below:
>> >> >
>> >> > (with output returned)
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  32.00 bytes
>> >> >   allocs estimate:  1
>> >> >   minimum time:     62.56 μs (0.00% GC)
>> >> >   median time:      62.63 μs (0.00% GC)
>> >> >   mean time:        72.77 μs (0.00% GC)
>> >> >   maximum time:     263.93 μs (0.00% GC)
>> >> >
>> >> >
>> >> >
>> >> > (without output returned)
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  0.00 bytes
>> >> >   allocs estimate:  0
>> >> >   minimum time:     11.22 μs (0.00% GC)
>> >> >   median time:      13.58 μs (0.00% GC)
>> >> >   mean time:        14.19 μs (0.00% GC)
>> >> >   maximum time:     119.73 μs (0.00% GC)
>> >> >
>> >> >
>> >> >
>> >> > Is any gentle soul out there, patient enough and willing to explain
>> >> > whether
>> >> > this might be a Julia bug, or whether it is my brain's bug...?
>> >> >
>>
>> Also, it is entirely possible for returning a value to change the
>> performance, since the compiler is allowed to do many more
>> optimizations if some value is not used.
>>
>> >> >
>> >> > The code snippet is pasted below, including the output of @benchmark:
>> >> >
>> >> > function test1()
>> >> >     a = rand(10000)
>> >> >     for k=1:10,
>> >> >         a = rand(10000)
>> >> >     end
>> >> > end
>> >> >
>> >> >
>> >> > function test2()
>> >> >     a = rand(10000)
>> >> >     for k=1:10,
>> >> >         a = rand(10000)
>> >>
>> >> Note that, because of the comma, you are looping `a` over the array.
>> >>
>> >> >     end
>> >> >     a[1]
>> >> > end
>> >> >
>> >> >
>> >> >> @benchmark test1() # without output returned
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  860.41 kb
>> >> >   allocs estimate:  44
>> >> >   minimum time:     210.01 μs (0.00% GC)
>> >> >   median time:      292.29 μs (0.00% GC)
>> >> >   mean time:        483.24 μs (20.95% GC)
>> >> >   maximum time:     26.80 ms (93.95% GC)
>> >> >
>> >> >
>> >> >
>> >> >> @benchmark test2() # with output returned
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          3398
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  2.37 mb
>> >> >   allocs estimate:  100045
>> >> >   minimum time:     704.49 μs (0.00% GC)
>> >> >   median time:      896.99 μs (0.00% GC)
>> >> >   mean time:        1.46 ms (17.55% GC)
>> >> >   maximum time:     37.16 ms (36.87% GC)
>> >> >
>> >> >
>> >> > Thank you all.
>
>
