I have no experience with it, but it looks like you could also just do:

Ans1 = SharedArray(Float64, (limit, int64(limit/2))
Ans2 = SharedArray(Float64, (limit, int64(limit/2))

@parallel for sample=1:samples, i=1:limit, j=1:int64(limit/2)
   Sx = S[i, sample]
   Sy = S[j, sample]
   Sxy = S[i+j, sample]
   ...

  Ans1[i,j] = Aix * Bix / samples / samples
  Ans2[i,j] = Cix / samples
end

return (Ans1, Ans2)


On Mon, Feb 3, 2014 at 8:48 AM, David Salamon <[email protected]> wrote:

> Also S[:,1] is allocating. it should look something like:
>
> for sample=1:samples, i=1:limit, j=1:int64(limit/2)
>    Sx = S[i, sample]
>    Sy = S[j, sample]
>    Sxy = S[i+j, sample]
>    ...
> end
>
>
> On Mon, Feb 3, 2014 at 8:45 AM, David Salamon <[email protected]> wrote:
>
>> You're not out of the no-slicing woods yet. Looks like you can get rid of
>> `mx` and `my`
>>
>> for i=1:limit, j=1:int64(limit/2)
>> end
>>
>>
>>
>> As far as parallelizing, you could define:
>> three_tup_add(a, b, c) = (a[1] + b[1] + c[1], a[2] + b[2] + c[2], a[3] +
>> b[3] + c[3])
>>
>> and then do a @parallel (three_tup_add) over your sample index?
>>
>> for that matter, why not compute the two parts of the answer directly
>> rather than going via A, B, and C?
>>
>>
>>
>>
>>
>>
>> On Mon, Feb 3, 2014 at 8:11 AM, Alex C <[email protected]> wrote:
>>
>>> Thanks. I've re-written the function to minimize the amount of copying
>>> (i.e. slicing) that is required. But now, I'm befuddled as to how to
>>> parallelize this function using Julia. Any suggestions?
>>>
>>> Alex
>>>
>>> function expensive_hat(S::Array{Complex{Float64},2}, mx::Array{Int64,2},
>>> my::Array{Int64,2})
>>>
>>>     samples = 64
>>>         A = zeros(size(mx));
>>>     B = zeros(size(mx));
>>>     C = zeros(size(mx));
>>>
>>>     for i = 1:samples
>>>         Si = S[:,i];
>>>         Sx = Si[mx];
>>>         Sy = Si[my];
>>>         Sxy = Si[mx+my];
>>>         Sxyc = conj(Sxy);
>>>
>>>                 A +=  abs2(Sy .* Sx);
>>>         B += abs2(sqrt(Sxyc .* Sxy));
>>>         C += Sxyc .* Sy .* Sx;
>>>     end
>>>
>>>         ans = (A .* B ./ samples ./ samples, C./samples)
>>>     return ans
>>>
>>> end
>>>
>>> data = rand(24000,64);
>>> limit = 2000;
>>>
>>> ix = int64([1:limit/2]);
>>> iy = ix[1:end/2];
>>> mg = zeros(Int64,length(iy),length(ix));
>>> mx = broadcast(+,ix',mg);
>>> my = broadcast(+,iy,mg);
>>> S = rfft(data,1)./24000;
>>>
>>> @time (AB, C) = expensive_hat(S,mx,my);
>>>
>>>
>>
>>
>

Reply via email to