It's the other way around: .* won't fuse because it's still an operator, but .= will. If you want .* to fuse, you can instead do:

    A .= *.(A,B)

since this invokes broadcast on *, instead of invoking .*. But that's just a temporary workaround.

On Tuesday, November 1, 2016 at 7:27:40 PM UTC-7, Tom Breloff wrote:
>
> As I understand it, the .* will fuse, but the .= will not (until 0.6?), so
> A will be rebound to a newly allocated array. If my understanding is wrong
> I'd love to know. There have been many times in the last few days that I
> would have used it...
>
> On Tue, Nov 1, 2016 at 10:06 PM, Sheehan Olver <dlfiv...@gmail.com> wrote:
>
>> Ah, good point. Though I guess that won't work til 0.6 since .* won't
>> auto-fuse yet?
>>
>> Sent from my iPhone
>>
>> On 2 Nov. 2016, at 12:55, Chris Rackauckas <rack...@gmail.com> wrote:
>>
>> This is pretty much obsolete by the . fusing changes:
>>
>>     A .= A.*B
>>
>> should be an in-place update of A scaled by B (Tomas' solution).
>>
>> On Tuesday, November 1, 2016 at 4:39:15 PM UTC-7, Sheehan Olver wrote:
>>>
>>> Should this be added to a package? I imagine if the arrays are on the
>>> GPU (AFArrays) then the operation could be much faster, and having a
>>> consistent name would be helpful.
>>>
>>> On Wednesday, October 7, 2015 at 1:28:29 AM UTC+11, Lionel du Peloux
>>> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I'm looking for the fastest way to do element-wise vector
>>>> multiplication in Julia. The best I could do is the following
>>>> implementation, which still runs 1.5x slower than the dot product. I
>>>> assume the dot product would include such an operation ... and then do
>>>> a cumulative sum over the element-wise product.
>>>>
>>>> The MKL lib includes such an operation (v?Mul) but it seems OpenBLAS
>>>> does not. So my questions are:
>>>>
>>>> 1) Is there any chance I can do vector element-wise multiplication
>>>>    faster than the actual dot product?
>>>> 2) Why is the built-in element-wise multiplication operator (.*) much
>>>>    slower than my own implementation for such a basic linalg operation
>>>>    (full julia)?
>>>>
>>>> Thank you,
>>>> Lionel
>>>>
>>>> Best custom implementation:
>>>>
>>>> function xpy!{T<:Number}(A::Vector{T},B::Vector{T})
>>>>   n = size(A)[1]
>>>>   if n == size(B)[1]
>>>>     for i=1:n
>>>>       @inbounds A[i] *= B[i]
>>>>     end
>>>>   end
>>>>   return A
>>>> end
>>>>
>>>> Benchmark results (JuliaBox, A = randn(300000)):
>>>>
>>>> function                          CPU (s)   GC (%)  ALLOCATION (bytes)  CPU (x)
>>>> dot(A,B)                          1.58e-04    0.00                  16      1.0
>>>> xpy!(A,B)                         2.31e-04    0.00                  80      1.5
>>>> NumericExtensions.multiply!(P,Q)  3.60e-04    0.00                  80      2.3
>>>> xpy!(A,B) - no @inbounds check    4.36e-04    0.00                  80      2.8
>>>> P.*Q                              2.52e-03   50.36             2400512     16.0
>>>> ############################################################
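
For anyone finding this thread later, here is a minimal sketch of the in-place update discussed above. It assumes Julia 0.6 or later, where dotted operators such as .* take part in broadcast fusion, so it will not behave the same way on 0.5. The rewritten xpy! is only an illustrative update of Lionel's function to the newer `where` syntax (with an explicit error on a length mismatch, which the original silently ignored); it is not something posted in the thread, and no timings are claimed for it.

```julia
# Sketch only: assumes Julia 0.6 or later (fused dot operators, `where` syntax).
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]
same = A                  # second reference, used to check that A is not rebound

A .= A .* B               # fused broadcast: writes into A, no temporary array
@assert same === A        # A is still the same array object

A .= (*).(A, B)           # equivalent function-call form of the broadcast

# Illustrative rewrite of the xpy! loop from the original post (an assumption,
# not the author's code): same element-wise update, newer method syntax.
function xpy!(A::AbstractVector{T}, B::AbstractVector{T}) where {T<:Number}
    length(A) == length(B) ||
        throw(DimensionMismatch("A and B must have the same length"))
    @inbounds for i in eachindex(A, B)
        A[i] *= B[i]      # multiply in place, element by element
    end
    return A
end
```

Both the broadcast form and the explicit loop update A in place; which one is faster on a given machine is something to measure rather than assume.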