On Thu, Jul 14, 2016 at 12:18 PM, Stefan Karpinski wrote:
> In other words the former only allocates less because the compiler can see
> that you don't do anything with the allocated memory and can skip allocating
> it at all. Returning a value isn't costly but it does force the array to
> actually be allocated so that it can be returned. Presumably in real code
> you would actually do something with the arrays you allocate so there would
> be no difference in performance.

I didn't say it that way since the issue here is actually that the use
of `a` made the type instability explicit. The compiler was able to
realize that no use of `a` was of uncertain type without the return.

>
> On Thu, Jul 14, 2016 at 10:36 AM, Yichao Yu wrote:
>>
>> On Thu, Jul 14, 2016 at 9:49 AM, Kevin Squire wrote:
>> > (To expand on Yichao's comment: Remove the comma in both for loops)
>> >
>> >
>> > On Thursday, July 14, 2016, Yichao Yu wrote:
>> >>
>> >> On Thu, Jul 14, 2016 at 6:49 AM, Michele Giugliano wrote:
>> >> > Julia newbie here!
>> >> >
>> >> > I noticed a performance loss (by means of @benchmark from
>> >> > BenchmarkTools), when a function returns a value versus when it
>> >> > does not.
>> >> >
>> >> > Note: in the code snippet (see below) that I prepared to exemplify my
>> >> > issue, there's also an increase in the number of allocations - which
>> >> > I don't understand - when returning values.
>> >> >
>> >> > However, in my own function (not included here), simulating a
>> >> > mathematical model, there is no such difference in the allocations
>> >> > but a ~5 times performance degradation - as indicated by the output
>> >> > of @benchmark below:
>> >> >
>> >> > (with output returned)
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  32.00 bytes
>> >> >   allocs estimate:  1
>> >> >   minimum time:     62.56 μs (0.00% GC)
>> >> >   median time:      62.63 μs (0.00% GC)
>> >> >   mean time:        72.77 μs (0.00% GC)
>> >> >   maximum time:     263.93 μs (0.00% GC)
>> >> >
>> >> >
>> >> > (without output returned)
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  0.00 bytes
>> >> >   allocs estimate:  0
>> >> >   minimum time:     11.22 μs (0.00% GC)
>> >> >   median time:      13.58 μs (0.00% GC)
>> >> >   mean time:        14.19 μs (0.00% GC)
>> >> >   maximum time:     119.73 μs (0.00% GC)
>> >> >
>> >> >
>> >> > Is any gentle soul out there, patient enough and willing to explain
>> >> > whether this might be a Julia bug, or whether it is my brain's
>> >> > bug...?
>> >> >
>>
>> Also, it is entirely possible for returning a value to change the
>> performance, since the compiler is allowed to do many more optimizations
>> if some value is not used.
>>
>> >> >
>> >> > The code snippet is pasted below, including the output of @benchmark:
>> >> >
>> >> > function test1()
>> >> >     a = rand(10000)
>> >> >     for k=1:10,
>> >> >         a = rand(10000)
>> >> >     end
>> >> > end
>> >> >
>> >> >
>> >> > function test2()
>> >> >     a = rand(10000)
>> >> >     for k=1:10,
>> >> >         a = rand(10000)
>> >>
>> >> Note that you are looping `a` over the array.
>> >>
>> >> >     end
>> >> >     a[1]
>> >> > end
>> >> >
>> >> >
>> >> >> @benchmark test1() # without output returned
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          10000
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  860.41 kb
>> >> >   allocs estimate:  44
>> >> >   minimum time:     210.01 μs (0.00% GC)
>> >> >   median time:      292.29 μs (0.00% GC)
>> >> >   mean time:        483.24 μs (20.95% GC)
>> >> >   maximum time:     26.80 ms (93.95% GC)
>> >> >
>> >> >
>> >> >> @benchmark test2() # with output returned
>> >> >
>> >> > BenchmarkTools.Trial:
>> >> >   samples:          3398
>> >> >   evals/sample:     1
>> >> >   time tolerance:   5.00%
>> >> >   memory tolerance: 1.00%
>> >> >   memory estimate:  2.37 mb
>> >> >   allocs estimate:  100045
>> >> >   minimum time:     704.49 μs (0.00% GC)
>> >> >   median time:      896.99 μs (0.00% GC)
>> >> >   mean time:        1.46 ms (17.55% GC)
>> >> >   maximum time:     37.16 ms (36.87% GC)
>> >> >
>> >> >
>> >> > Thank you all.
>
>
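
For reference, a minimal sketch of the type instability discussed in this thread. The function names `stable` and `unstable` are hypothetical and not from the original post. Instead of relying on how the trailing comma in `for k=1:10,` was parsed on the Julia version used in the thread, the second function assigns an array element to `a` explicitly, so that `a` holds an array at one point and a scalar at another. The loop sizes are reduced only to keep the example quick.

```julia
# Sketch only: `stable` and `unstable` are illustrative names.

# `a` is rebound to a fresh Vector{Float64} on every iteration,
# so its type never changes (this is the loop with the comma removed).
function stable()
    a = rand(100)
    for k = 1:10
        a = rand(100)
    end
    return a
end

# The comma makes this a nested loop: for every k, x runs over the
# elements of a new random array. Assigning x to `a` means `a` is
# sometimes a Vector{Float64} and sometimes a Float64.
function unstable()
    a = rand(100)
    for k = 1:10, x in rand(100)
        a = x
    end
    return a
end

# Ask inference what each function returns.
println(Base.return_types(stable, ()))
println(Base.return_types(unstable, ()))
```

At the REPL, `@code_warntype unstable()` should flag `a` as a `Union` of the two types, which is the kind of uncertainty that only matters once `a` is actually used, for instance by returning it.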
