OK, good to know. I think putting the function in a package is overkill.
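
For now I'll probably just keep a small local helper along these lines (rough, untested sketch; elmul! is only a placeholder name):

function elmul!(A::AbstractVector, B::AbstractVector)
    # In-place element-wise multiply: overwrite A with A .* B and return A.
    length(A) == length(B) || throw(DimensionMismatch("A and B must have the same length"))
    @inbounds for i in eachindex(A, B)
        A[i] *= B[i]
    end
    return A
end

# usage: A = randn(300_000); B = randn(300_000); elmul!(A, B)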





> On 2 Nov. 2016, at 6:35 pm, Chris Rackauckas <rackd...@gmail.com> wrote:
> 
> Yes, this most likely won't help for GPU arrays because you likely don't want 
> to be looping through elements serially: you want to call a vectorized GPU 
> function which will do the computation in parallel on the GPU. ArrayFire's 
> mathematical operations are already overloaded to do this, but I don't think 
> they can fuse.
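> 
> For what it's worth, on the GPU you would just let the overloaded operators do the work. Rough sketch, untested; I haven't checked the exact ArrayFire.jl API, so take the names with a grain of salt:
> 
> using ArrayFire
> A = AFArray(rand(Float32, 300_000))   # data lives on the GPU
> B = AFArray(rand(Float32, 300_000))
> C = A .* B                            # element-wise product computed on the GPU
> # each operation launches its own kernel; as noted above, they don't fuse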
> 
> On Tuesday, November 1, 2016 at 8:06:12 PM UTC-7, Sheehan Olver wrote:
> Ah thanks!
> 
> Though I guess if I want the same code to work also on a GPU array then this 
> won't help?
> 
> Sent from my iPhone
> 
> On 2 Nov. 2016, at 13:51, Chris Rackauckas <rack...@gmail.com> wrote:
> 
>> It's the other way around. .* won't fuse because it's still an operator. .= 
>> will. If you want .* to fuse, you can instead do:
>> 
>> A .= *.(A,B)
>> 
>> since this invokes the broadcast on *, instead of invoking .*. But that's 
>> just a temporary thing.
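>> 
>> To spell it out (a rough sketch of my understanding on 0.5, untested):
>> 
>> A = rand(10); B = rand(10)
>> A = A .* B              # .* allocates a new array, then A is rebound to it
>> A .= *.(A, B)           # fuses into a single in-place loop, no temporary
>> broadcast!(*, A, A, B)  # explicit spelling of that same in-place broadcast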
>> 
>> On Tuesday, November 1, 2016 at 7:27:40 PM UTC-7, Tom Breloff wrote:
>> As I understand it, the .* will fuse, but the .= will not (until 0.6?), so A 
>> will be rebound to a newly allocated array.  If my understanding is wrong 
>> I'd love to know.  There have been many times in the last few days that I 
>> would have used it...
>> 
>> On Tue, Nov 1, 2016 at 10:06 PM, Sheehan Olver <dlfiv...@gmail.com> wrote:
>> Ah, good point.  Though I guess that won't work til 0.6 since .* won't 
>> auto-fuse yet? 
>> 
>> Sent from my iPhone
>> 
>> On 2 Nov. 2016, at 12:55, Chris Rackauckas <rack...@gmail.com> wrote:
>> 
>>> This is pretty much made obsolete by the . fusing changes:
>>> 
>>> A .= A.*B
>>> 
>>> should be an in-place update of A scaled by B (Tomas' solution).
>>> 
>>> On Tuesday, November 1, 2016 at 4:39:15 PM UTC-7, Sheehan Olver wrote:
>>> Should this be added to a package?  I imagine if the arrays are on the GPU 
>>> (AFArrays) then the operation could be much faster, and having a consistent 
>>> name would be helpful.
>>> 
>>> 
>>> On Wednesday, October 7, 2015 at 1:28:29 AM UTC+11, Lionel du Peloux wrote:
>>> Dear all,
>>> 
>>> I'm looking for the fastest way to do element-wise vector multiplication in 
>>> Julia. The best I could do is the following implementation, which still runs 
>>> 1.5x slower than the dot product. I assume the dot product would include such 
>>> an operation ... and then do a sum over the element-wise products.
>>> 
>>> The MKL lib includes such an operation (v?Mul) but it seems OpenBLAS does 
>>> not. So my questions are:
>>> 
>>> 1) is there any chance I can do vector element-wise multiplication faster 
>>> than the actual dot product?
>>> 2) why is the built-in element-wise multiplication operator (.*) so much 
>>> slower than my own implementation for such a basic linear-algebra operation 
>>> (pure Julia)?
>>> 
>>> Thank you,
>>> Lionel
>>> 
>>> Best custom implementation :
>>> 
>>> # In-place element-wise multiply: overwrites A with A .* B and returns A.
>>> # (Silently returns A unchanged if the lengths don't match.)
>>> function xpy!{T<:Number}(A::Vector{T},B::Vector{T})
>>>   n = size(A)[1]
>>>   if n == size(B)[1]
>>>     for i = 1:n
>>>       @inbounds A[i] *= B[i]
>>>     end
>>>   end
>>>   return A
>>> end
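>>> 
>>> Called like this (sketch), it mutates A in place:
>>> 
>>> A = randn(300000); B = randn(300000)
>>> xpy!(A, B)   # A now holds the element-wise product, B is unchanged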
>>> 
>>> Benchmark results (JuliaBox, A = randn(300000)):
>>> 
>>> function                          CPU (s)     GC (%)   ALLOCATION (bytes)   CPU (x)
>>> dot(A,B)                          1.58e-04    0.00     16                    1.0
>>> xpy!(A,B)                         2.31e-04    0.00     80                    1.5
>>> NumericExtensions.multiply!(P,Q)  3.60e-04    0.00     80                    2.3
>>> xpy!(A,B) - no @inbounds check    4.36e-04    0.00     80                    2.8
>>> P.*Q                              2.52e-03    50.36    2400512               16.0
>> 
