Dear Yichao,

Thanks very much for the prompt response.  This question arises in a 
code for finite element stiffness matrix assembly.  The computation 
has an outer loop over elements (possibly millions of them).  Inside 
this loop is a sequence of operations on 'small' vectors and matrices (say 
3-by-3 matrices).  These inner operations are mostly gaxpy and indirect 
addressing.  The code would be much more readable if all of these 
small-vector operations were written without explicit loops such as 
for i=1:3, but it seems that replacing the loops with [:] assignments 
causes major heap allocation.  Is there a macro package or other solution 
applicable to this form of computation? 
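
To make the question concrete, here is a minimal sketch of one loop-free form of the test4 example quoted below. It assumes a recent Julia (1.x), where the `@views` macro and in-place broadcast assignment `.=` are available. The name test6 is hypothetical and is not part of the original code. The idea is that `y[i]` with a vector index builds a new temporary array, so reading through a view and writing elementwise into the existing `z` should avoid that copy. I have not timed it, so whether it matches the explicit loop in test5 is something to check with @time.

```julia
# Hypothetical variant of test4: same arithmetic, but the indexed read
# y[i] is taken through a view instead of being copied into a temporary.
function test6(n)
    y = [2.0, 6.0, 3.0]
    i = [1, 2, 3]
    z = [0.0, 0.0, 0.0]
    u = 0.0
    for j = 1:n
        # @views rewrites y[i] as view(y, i), so no copy of y[i] is built;
        # .= stores the values elementwise into the existing array z.
        @views z .= y[i]
        u += sum(z)
    end
    u
end
```

Calling test6(n) should return the same value as test4(n) and test5(n), since only the way `z` is filled changes.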

Thanks,
Steve Vavasis

On Thursday, May 19, 2016 at 9:47:12 PM UTC-4, [email protected] wrote:
>
> The two functions test4 and test5 below are equivalent, but test5 is much 
> faster than test4.  Apparently test4 is carrying out a heap allocation on 
> each iteration of the j-loop.  Why?   In general, which kinds of assignment 
> statements of the form <array>=<array> create temporaries, and which don't? 
>  (In the example below, if the indirect addressing via array i is 
> eliminated, then the two functions have comparable performance.)
>
> Thanks,
> Steve Vavasis
>
> function test4(n)
>     y = [2.0, 6.0, 3.0]
>     i = [1, 2, 3]
>     z = [0.0, 0.0, 0.0]
>     u = 0.0
>     for j = 1 : n
>         z[:] = y[i]
>         u += sum(z)
>     end
>     u
> end    
>
> function test5(n)
>     y = [2.0, 6.0, 3.0]
>     i = [1, 2, 3]
>     z = [0.0, 0.0, 0.0]
>     u = 0.0
>     for j = 1 : n
>         for k = 1 : 3
>             z[k] = y[i[k]]
>         end
>         u += sum(z)
>     end
>     u
> end    
>
>
> julia> @time Testmv.test4(10000000)
>   1.071396 seconds (20.00 M allocations: 1.192 GB, 7.03% gc time)
> 1.1e8
>
> julia> @time Testmv.test5(10000000)
>   0.184411 seconds (4.61 k allocations: 198.072 KB)
> 1.1e8
>
>
