I think I am misunderstanding the temporary array allocation process. Is it allocating one or two temp arrays? Where have I gone wrong here:
tmp = 2y      (allocates a temporary array to store result)
tmp .-= 4z    (also allocates a temporary array for 4z? Why not just use z directly, thus tmp[i] = tmp[i] - 4*z[i])
tmp ./= w     (Uses previous temp array and w to do the division overwriting tmp, i.e. loops over tmp[i] = tmp[i]/w[i])
x .+= tmp     (performs x[i] = x[i] + tmp[i])

On Friday, December 18, 2015 at 1:53:02 PM UTC-5, Steven G. Johnson wrote:
>
> On Friday, December 18, 2015 at 1:32:16 PM UTC-5, Ethan Anderes wrote:
>>
>> Ok, thanks for the info (and @inbounds does improve it a bit). I usually
>> follow your advice and fuse the operations together when I need the speed,
>> but since I do all manner of combinations of vectorized operations
>> throughout my module I tend to prefer using .*=, ./=, etc unless I need
>> it.
>>
> Having "all manner of combinations" of these operations is a good reason
> *not* to define in-place versions of these operations. For example,
> imagine the computation:
>
> x = x + (2y - 4z) ./ w
>
> with your proposed in-place assignment operations, I guess this would
> become:
>
> tmp = 2y
> tmp .-= 4z
> tmp ./= w
> x .+= tmp
>
> which still allocates two temporary arrays (one for tmp and one for 4z),
> and involves five separate loops. Compare to:
>
> for i in eachindex(x)
>     x[i] += (2y[i] - 4z[i]) / w[i]
> end
>
> which involves only one loop (and probably better cache performance as a
> result) and no temporary arrays. (You can add @inbounds if you want a bit
> more performance and know that w/x/y/z have the same shape.) Not only is
> it more efficient than a sequence of in-place assignments, but I would
> argue that it is much more readable as well, despite the need for an
> explicit loop.
>
> Alternatively, you can use the Devectorize package, and something like
>
> @devec x[:] = x + (2y - 4z) ./ w
>
> will basically do the same thing as the loop if I understand @devec
> correctly.
>
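The two variants compared in the quoted message can be put side by side in a minimal sketch. It assumes a recent Julia (1.x), where `.-=`, `./=` and `.+=` update the left-hand array in place; at the time of this thread those in-place semantics were only a proposal. The function names `stepwise!` and `fused!` and the sample values are hypothetical, chosen only for illustration.

```julia
# Sketch assuming Julia 1.x semantics: dotted assignment operators work in place.

# Step-by-step version from the thread: two temporaries, five loops.
function stepwise!(x, y, z, w)
    tmp = 2y      # loop 1: allocates tmp
    tmp .-= 4z    # loop 2 builds the temporary 4z, loop 3 subtracts it from tmp in place
    tmp ./= w     # loop 4: in place, no new array
    x .+= tmp     # loop 5: in place, no new array
    return x
end

# Single explicit loop: no temporary arrays at all.
function fused!(x, y, z, w)
    # @inbounds is only safe because the shapes are checked first
    @assert size(x) == size(y) == size(z) == size(w)
    @inbounds for i in eachindex(x)
        x[i] += (2y[i] - 4z[i]) / w[i]
    end
    return x
end

# Hypothetical sample data
y = [1.0, 2.0, 3.0]
z = [0.25, 0.5, 1.0]
w = [2.0, 4.0, 8.0]

x1 = stepwise!(zeros(3), y, z, w)
x2 = fused!(zeros(3), y, z, w)
```

The second temporary comes from `4z` itself: it is an ordinary (undotted) multiplication, so the array `4z` is built before the `.-=` ever runs. Under the same Julia 1.x assumption, writing `tmp .-= 4 .* z` instead should let the multiplication fuse into the in-place update and avoid that array.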
