On Friday, December 18, 2015 at 1:32:16 PM UTC-5, Ethan Anderes wrote:
>
> Ok, thanks for the info (and @inbounds does improve it a bit). I usually 
> follow your advice and fuse the operations together when I need the speed, 
> but since I do all manner of combinations of vectorized operations 
> throughout my module I tend to prefer using .*=, ./=, etc unless I need 
> it.
>
Having "all manner of combinations" of these operations is a good reason 
*not* to define in-place versions of these operations.  For example, 
imagine the computation:

x = x + (2y - 4z) ./ w


With your proposed in-place assignment operations, I guess this would 
become:

tmp = 2y
tmp .-= 4z
tmp ./= w
x .+= tmp


which still allocates two temporary arrays (one for tmp and one for 4z), 
and involves five separate loops.  Compare to:

for i in eachindex(x)
    x[i] += (2y[i] - 4z[i]) / w[i]
end


which involves only one loop (and probably better cache performance as a 
result) and no temporary arrays.  (You can add @inbounds if you want a bit 
more performance and know that w/x/y/z have the same shape.)  Not only is 
it more efficient than a sequence of in-place assignments, but I would 
argue that it is much more readable as well, despite the need for an 
explicit loop.
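
For anyone who wants to try this, here is a minimal sketch of the fused loop wrapped in a function, checked against the vectorized expression. The name fused_update! and the sample values are made up for illustration. The explicit size check is an addition so that @inbounds is safe to use.

```julia
# Hypothetical helper: computes x = x + (2y - 4z) ./ w in a single pass,
# updating x in place without allocating temporary arrays.
function fused_update!(x, y, z, w)
    # @inbounds below is only safe if all four arrays have the same shape.
    size(x) == size(y) == size(z) == size(w) ||
        throw(DimensionMismatch("x, y, z and w must have the same shape"))
    @inbounds for i in eachindex(x)
        x[i] += (2y[i] - 4z[i]) / w[i]
    end
    return x
end

# Made-up sample data.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
z = [0.5, 1.0, 1.5]
w = [2.0, 4.0, 8.0]

# Vectorized reference result (this allocates temporaries).
expected = x + (2y - 4z) ./ w

fused_update!(x, y, z, w)
```

After the call, x should hold the same values as expected.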

Alternatively, you can use the Devectorize package, and something like

@devec x[:] = x + (2y - 4z) ./ w


will basically do the same thing as the loop if I understand @devec 
correctly.
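
A note for readers on later Julia releases (0.6 and newer, which postdate this message): dotted operators and dotted assignments there fuse into a single in-place loop, so a fully dotted form of the expression should behave much like the explicit loop. This is a sketch with made-up sample values, not something that applies to the Julia version discussed in this thread.

```julia
# Assumes Julia 0.6 or newer, where dot calls fuse into one loop.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
z = [0.5, 1.0, 1.5]
w = [2.0, 4.0, 8.0]

# Vectorized reference result (this allocates temporaries).
expected = x + (2y - 4z) ./ w

# Every operation is dotted, so the whole right-hand side and the
# update of x run as one loop that writes into x's existing storage.
x .+= (2 .* y .- 4 .* z) ./ w
```

Note that leaving any operation undotted (for example writing 4z instead of 4 .* z) would bring back a temporary array for that subexpression.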
