In other words, the former allocates less only because the compiler can see that you don't do anything with the allocated memory and can skip allocating it at all. Returning a value isn't costly in itself, but it does force the array to actually be allocated so that it can be returned. Presumably in real code you would actually do something with the arrays you allocate, so there would be no difference in performance.
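If you want to check this on your own machine, here is a minimal sketch. It is the snippet from the original post with the stray comma after `1:10` removed, as Kevin suggested, plus an explicit `return nothing` that I added to make the "no output" case obvious. It uses `@allocated` from Base rather than BenchmarkTools. I have not listed expected numbers because whether the unused arrays are actually skipped will depend on your Julia version.

```julia
# Sketch only: the loops from the original post with the comma after `1:10`
# removed. The explicit `return nothing` is an addition for clarity.

function test1()
    a = rand(10000)
    for k = 1:10
        a = rand(10000)   # result is never used afterwards
    end
    return nothing
end

function test2()
    a = rand(10000)
    for k = 1:10
        a = rand(10000)
    end
    return a[1]           # using `a` forces the last array to exist
end

# Call each function once first so compilation is not included in the measurement.
test1()
test2()

println("test1 (nothing returned): ", @allocated(test1()), " bytes")
println("test2 (value returned):   ", @allocated(test2()), " bytes")
```

If both functions report similar allocation counts, the compiler did not remove the unused arrays in `test1`, and any remaining difference comes from something other than the return itself.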
On Thu, Jul 14, 2016 at 10:36 AM, Yichao Yu wrote:
> On Thu, Jul 14, 2016 at 9:49 AM, Kevin Squire wrote:
> > (To expand on Yichao's comment: Remove the comma in both for loops)
> >
> >
> > On Thursday, July 14, 2016, Yichao Yu wrote:
> >>
> >> On Thu, Jul 14, 2016 at 6:49 AM, Michele Giugliano wrote:
> >> > Julia newbie here!
> >> >
> >> > I noticed a performance loss (by means of @benchmark from
> >> > BenchmarkTools), when a function returns a value versus when it
> >> > does not.
> >> >
> >> > Note: in the code snippet (see below) that I prepared to exemplify
> >> > my issue, there's also an increase in the number of allocations -
> >> > which I don't understand - when returning values.
> >> >
> >> > However, in my own function (not included here), simulating a
> >> > mathematical model, there is no such a difference in the
> >> > allocations but a ~5 times performance degradation - as indicated
> >> > by the output of @benchmark below:
> >> >
> >> > (with output returned)
> >> >
> >> > BenchmarkTools.Trial:
> >> >   samples:          10000
> >> >   evals/sample:     1
> >> >   time tolerance:   5.00%
> >> >   memory tolerance: 1.00%
> >> >   memory estimate:  32.00 bytes
> >> >   allocs estimate:  1
> >> >   minimum time:     62.56 μs (0.00% GC)
> >> >   median time:      62.63 μs (0.00% GC)
> >> >   mean time:        72.77 μs (0.00% GC)
> >> >   maximum time:     263.93 μs (0.00% GC)
> >> >
> >> >
> >> > (without output returned)
> >> >
> >> > BenchmarkTools.Trial:
> >> >   samples:          10000
> >> >   evals/sample:     1
> >> >   time tolerance:   5.00%
> >> >   memory tolerance: 1.00%
> >> >   memory estimate:  0.00 bytes
> >> >   allocs estimate:  0
> >> >   minimum time:     11.22 μs (0.00% GC)
> >> >   median time:      13.58 μs (0.00% GC)
> >> >   mean time:        14.19 μs (0.00% GC)
> >> >   maximum time:     119.73 μs (0.00% GC)
> >> >
> >> >
> >> > Is any gentle soul out there, patient enough and willing to explain
> >> > whether this might be a Julia's bug, or whether it is my brain's
> >> > bug... ?
> >> >
>
> Also, returning a value changes the performance is totally possible
> since the compiler is allowed to do much more optimizations if some
> value is not used.
>
> >> >
> >> > The code snippet is pasted below, including the output of @benchmark:
> >> >
> >> > function test1()
> >> >     a = rand(10000)
> >> >     for k=1:10,
> >> >         a = rand(10000)
> >> >     end
> >> > end
> >> >
> >> >
> >> > function test2()
> >> >     a = rand(10000)
> >> >     for k=1:10,
> >> >         a = rand(10000)
> >>
> >> Note that you are looping a over the array.
> >>
> >> >     end
> >> >     a[1]
> >> > end
> >> >
> >> >
> >> >> @benchmark test1() # without output returned
> >> >
> >> > BenchmarkTools.Trial:
> >> >   samples:          10000
> >> >   evals/sample:     1
> >> >   time tolerance:   5.00%
> >> >   memory tolerance: 1.00%
> >> >   memory estimate:  860.41 kb
> >> >   allocs estimate:  44
> >> >   minimum time:     210.01 μs (0.00% GC)
> >> >   median time:      292.29 μs (0.00% GC)
> >> >   mean time:        483.24 μs (20.95% GC)
> >> >   maximum time:     26.80 ms (93.95% GC)
> >> >
> >> >
> >> >> @benchmark test2() # with output returned
> >> >
> >> > BenchmarkTools.Trial:
> >> >   samples:          3398
> >> >   evals/sample:     1
> >> >   time tolerance:   5.00%
> >> >   memory tolerance: 1.00%
> >> >   memory estimate:  2.37 mb
> >> >   allocs estimate:  100045
> >> >   minimum time:     704.49 μs (0.00% GC)
> >> >   median time:      896.99 μs (0.00% GC)
> >> >   mean time:        1.46 ms (17.55% GC)
> >> >   maximum time:     37.16 ms (36.87% GC)
> >> >
> >> >
> >> > Thank you all.
