I think that for medium-sized (but not huge) arrays in v0.5 you may want to use @threads from the threading branch, and for really large arrays you may want to use @parallel. But you'd have to test some timings.
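Here's a rough, untested sketch of what I mean, assuming Julia was started with JULIA_NUM_THREADS > 1 on the threading branch; the function name xpy_threaded! is just illustrative, not something from a package:

```julia
# Sketch (not benchmarked): threaded element-wise multiply for v0.5,
# assuming Julia was launched with JULIA_NUM_THREADS > 1.
function xpy_threaded!{T<:Number}(A::Vector{T}, B::Vector{T})
    n = length(A)
    n == length(B) || throw(DimensionMismatch("A and B must have equal length"))
    Threads.@threads for i = 1:n
        @inbounds A[i] *= B[i]
    end
    return A
end

# For really large arrays, the @parallel route splits the loop across
# worker processes (addprocs first) using SharedArrays:
# P = SharedArray(Float64, n); Q = SharedArray(Float64, n)
# @sync @parallel for i = 1:n
#     @inbounds P[i] *= Q[i]
# end
```

Whether threads or processes win depends on array size and memory bandwidth, hence the need to time both.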
On Monday, June 20, 2016 at 11:38:15 AM UTC+1, [email protected] wrote:
>
> I have the same question regarding how to calculate the entry-wise vector
> product and found this thread. As a novice, I wonder if the following code
> snippet is still the standard for entry-wise vector multiplication that one
> should stick to in practice? Thanks!
>
> @fastmath @inbounds @simd for i = 1:n
>     A[i] *= B[i]
> end
>
> On Tuesday, October 6, 2015 at 3:28:29 PM UTC+1, Lionel du Peloux wrote:
>>
>> Dear all,
>>
>> I'm looking for the fastest way to do element-wise vector multiplication
>> in Julia. The best I have come up with is the following implementation, which
>> still runs 1.5x slower than the dot product. I assume the dot product would
>> include such an operation ... and then do a cumulative sum over the
>> element-wise product.
>>
>> The MKL lib includes such an operation (v?Mul) but it seems OpenBLAS does
>> not. So my questions are:
>>
>> 1) Is there any chance I can do vector element-wise multiplication faster
>> than the actual dot product?
>> 2) Why is the built-in element-wise multiplication operator (.*) much
>> slower than my own implementation for such a basic linalg operation (pure
>> Julia)?
>>
>> Thank you,
>> Lionel
>>
>> Best custom implementation:
>>
>> function xpy!{T<:Number}(A::Vector{T}, B::Vector{T})
>>     n = size(A)[1]
>>     if n == size(B)[1]
>>         for i = 1:n
>>             @inbounds A[i] *= B[i]
>>         end
>>     end
>>     return A
>> end
>>
>> Benchmark results (JuliaBox, A = randn(300000)):
>>
>> function                            CPU (s)    GC (%)   ALLOCATION (bytes)   CPU (x)
>> dot(A,B)                            1.58e-04   0.00     16                   1.0
>> xpy!(A,B)                           2.31e-04   0.00     80                   1.5
>> NumericExtensions.multiply!(P,Q)    3.60e-04   0.00     80                   2.3
>> xpy!(A,B) - no @inbounds check      4.36e-04   0.00     80                   2.8
>> P.*Q                                2.52e-03   50.36    2400512              16.0
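For anyone reading this later, here is one quick way to reproduce a comparison like the table above with plain @time. It is only a sketch: the numbers will differ from the JuliaBox run, and the loop body simply mirrors the @fastmath/@inbounds/@simd snippet quoted above.

```julia
# Rough timing comparison; warm everything up first so compilation
# time isn't included in the measurements.
A = randn(300000); B = randn(300000)

function xpy!{T<:Number}(A::Vector{T}, B::Vector{T})
    n = length(A)
    n == length(B) || throw(DimensionMismatch("A and B must have equal length"))
    @fastmath @inbounds @simd for i = 1:n
        A[i] *= B[i]
    end
    return A
end

dot(A, B); A .* B; xpy!(copy(A), B)   # warm-up calls (force compilation)

@time dot(A, B)         # BLAS reduction, no output vector allocated
@time A .* B            # allocates a fresh result vector on every call
@time xpy!(copy(A), B)  # in-place element-wise product (copy keeps A intact)
```

The allocation column in the table above is the main reason P.*Q looks so bad: every call builds a new 300000-element vector and then pays for garbage collection, whereas the in-place loop and dot() allocate essentially nothing.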
