Dear Yichao,

Thanks very much for the prompt response. This question arises from a code for finite element stiffness matrix assembly. The computation involves an outer loop over elements (possibly millions of them). Inside this loop is a sequence of operations on small vectors and matrices (say, 3-by-3 matrices). These inner operations are mostly gaxpy and indirect addressing. The code would be much more readable if all of these small-vector operations were written without explicit loops of the form for i=1:3, but it seems that replacing the loops with [:] causes major heap allocation. Is there a macro package or other solution applicable to this form of computation?

Thanks,
Steve Vavasis

On Thursday, May 19, 2016 at 9:47:12 PM UTC-4, [email protected] wrote:
>
> The two functions test4 and test5 below are equivalent, but test5 is much
> faster than test4. Apparently test4 is carrying out a heap allocation on
> each iteration of the j-loop. Why? In general, which kinds of assignment
> statements of the form <array>=<array> create temporaries, and which don't?
> (In the example below, if the indirect addressing via array i is
> eliminated, then the two functions have comparable performance.)
>
> Thanks,
> Steve Vavasis
>
> function test4(n)
>     y = [2.0, 6.0, 3.0]
>     i = [1, 2, 3]
>     z = [0.0, 0.0, 0.0]
>     u = 0.0
>     for j = 1 : n
>         z[:] = y[i]
>         u += sum(z)
>     end
>     u
> end
>
> function test5(n)
>     y = [2.0, 6.0, 3.0]
>     i = [1, 2, 3]
>     z = [0.0, 0.0, 0.0]
>     u = 0.0
>     for j = 1 : n
>         for k = 1 : 3
>             z[k] = y[i[k]]
>         end
>         u += sum(z)
>     end
>     u
> end
>
> julia> @time Testmv.test4(10000000)
>   1.071396 seconds (20.00 M allocations: 1.192 GB, 7.03% gc time)
> 1.1e8
>
> julia> @time Testmv.test5(10000000)
>   0.184411 seconds (4.61 k allocations: 198.072 KB)
> 1.1e8
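
For reference, the temporary in test4 comes from the right-hand side: indexing an array with another array, as in y[i], builds a new array, which z[:] = ... then copies into z. Below is a minimal sketch of two ways to keep the loop body short without that temporary. It assumes a recent Julia (1.x) rather than the version used in the timings above, and the names gather!, test6 and test7 are hypothetical, chosen only for this illustration. Whether the view-based variant is fully allocation-free depends on the Julia version, so it is worth checking with @time.

```julia
# Hypothetical helper: copy y[i[k]] into z[k] without creating a temporary array.
function gather!(z, y, i)
    @inbounds for k in eachindex(i)
        z[k] = y[i[k]]
    end
    return z
end

# Same computation as test4/test5, with the inner loop hidden in gather!.
function test6(n)
    y = [2.0, 6.0, 3.0]
    i = [1, 2, 3]
    z = [0.0, 0.0, 0.0]
    u = 0.0
    for j = 1:n
        gather!(z, y, i)
        u += sum(z)
    end
    return u
end

# Variant using a view plus in-place broadcast assignment (assumes Julia 1.x).
# view(y, i) wraps y and i instead of copying the selected elements.
function test7(n)
    y = [2.0, 6.0, 3.0]
    i = [1, 2, 3]
    z = [0.0, 0.0, 0.0]
    u = 0.0
    for j = 1:n
        z .= view(y, i)
        u += sum(z)
    end
    return u
end
```

Both variants can be compared against test4 and test5 with @time; the first call includes compilation, so the second call is the one to read.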
