Some more weirdness

Starting with julia -p 8

A=Base.shmem_fill(1, (1000,1000))

Using 2 workers:
for i in 1:100
    t1 = time(); p = 2 + (i % 2); remotecall_fetch(p, x->1, A); t2 = time()
    println("@ $p ", int((t2 - t1) * 1000))
end

prints

...
@ 3 8
@ 2 32
@ 3 8
@ 2 32
@ 3 8
@ 2 32
@ 3 8
@ 2 32


Notice that pid 2 always takes 32 milliseconds while pid 3 always takes 8 milliseconds.



With 4 workers:

for i in 1:100
    t1 = time(); p = 2 + (i % 4); remotecall_fetch(p, x->1, A); t2 = time()
    println("@ $p ", int((t2 - t1) * 1000))
end

...
@ 2 31
@ 3 4
@ 4 4
@ 5 1
@ 2 31
@ 3 4
@ 4 4
@ 5 1
@ 2 31
@ 3 4
@ 4 4
@ 5 1
@ 2 31


Now pid 2 always takes 31 milliseconds, pids 3 and 4 take 4 milliseconds each, and pid 5 takes 1 millisecond.

With 8 workers:

for i in 1:100
    t1 = time(); p = 2 + (i % 8); remotecall_fetch(p, x->1, A); t2 = time()
    println("@ $p ", int((t2 - t1) * 1000))
end

....
@ 2 20
@ 3 4
@ 4 1
@ 5 3
@ 6 4
@ 7 1
@ 8 2
@ 9 4
@ 2 20
@ 3 4
@ 4 1
@ 5 3
@ 6 4
@ 7 1
@ 8 2
@ 9 4
@ 2 20
@ 3 4
@ 4 1
@ 5 3
@ 6 4
@ 7 1
@ 8 3
@ 9 4
@ 2 20
@ 3 4
@ 4 1
@ 5 3
@ 6 4


Pid 2 always takes 20 milliseconds, and the timings for the other pids are pretty consistent from round to round too.
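
The per-pid numbers above could also be collected in one pass and summarised per worker, which makes a consistently slow pid easier to spot. This is only a rough sketch and it makes several assumptions: it targets a current Julia (1.x) with the Distributed, SharedArrays and Statistics standard libraries rather than the 2014-era API used in the loops above, so remotecall_fetch takes the function first and the array is built with SharedArray{Int} plus fill! instead of Base.shmem_fill. The helper name time_workers is made up for illustration.

```julia
# Assumption: Julia 1.x standard libraries, not the 2014-era API shown above.
using Distributed

# Start two local workers only if Julia was not launched with `-p N`.
nprocs() == 1 && addprocs(2)

@everywhere using SharedArrays
using Statistics

# Shared 1000x1000 array of ones (stand-in for Base.shmem_fill(1, (1000,1000))).
A = SharedArray{Int}((1000, 1000))
fill!(A, 1)

# Hypothetical helper: time a trivial remote call on every worker,
# `rounds` times each, and return the timings in milliseconds per pid.
function time_workers(A; rounds = 100)
    timings = Dict(p => Float64[] for p in workers())
    for _ in 1:rounds, p in workers()
        t = @elapsed remotecall_fetch(x -> 1, p, A)
        push!(timings[p], 1000 * t)
    end
    return timings
end

timings = time_workers(A)
for p in sort(collect(keys(timings)))
    ts = timings[p]
    println("@ $p  min=", round(minimum(ts), digits = 2),
            " ms  median=", round(median(ts), digits = 2), " ms")
end
```

Reporting the minimum and median rather than every sample keeps one-off costs such as the first-call compilation on each worker from dominating the picture.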

Any explanations?







On Thu, Mar 27, 2014 at 5:24 PM, Amit Murthy <[email protected]> wrote:

> I think the code does not do what you want.
>
> In the non-shared case you are sending a 10^6-element integer array over
> the network 1000 times and summing it as many times. Most of the time is
> network traffic. Reduce 'n' to, say, 10, and you will see what I mean.
>
> In the shared case you are not sending the array over the network but
> still summing the entire array 1000 times. Some of the remotecall_fetch
> calls seem to be taking an extra 40 milliseconds, which adds to the
> total.
>
> The shared time of 6 seconds being less than the 15 seconds for
> non-shared seems to be just incidental.
>
> I don't yet have an explanation for the extra 40 milliseconds per
> remotecall_fetch (for some calls only) in the shared case.
>
>
>
>
>
>
> On Thu, Mar 27, 2014 at 2:50 PM, Mikael Simberg <[email protected]> wrote:
>
>> Hi,
>> I'm having some trouble figuring out exactly how I'm supposed to use
>> SharedArrays - I might just be misunderstanding them or else something
>> odd is happening with them.
>>
>> I'm trying to do some parallel computing which looks a bit like this
>> test case:
>>
>> function createdata(shared)
>>     const n = 1000
>>     if shared
>>         A = SharedArray(Uint, (n, n))
>>     else
>>         A = Array(Uint, (n, n))
>>     end
>>     for i = 1:n, j = 1:n
>>         A[i, j] = rand(Uint)
>>     end
>>
>>     return n, A
>> end
>>
>> function mainfunction(r; shared = false)
>>     n, A = createdata(shared)
>>
>>     i = 1
>>     nextidx() = (idx = i; i += 1; idx)
>>
>>     @sync begin
>>         for p in workers()
>>             @async begin
>>                 while true
>>                     idx = nextidx()
>>                     if idx > r
>>                         break
>>                     end
>>                     found, s = remotecall_fetch(p, parfunction, n, A)
>>                 end
>>             end
>>         end
>>     end
>> end
>>
>> function parfunction(n::Int, A::Array{Uint, 2})
>>     # possibly do some other computation here independent of shared arrays
>>     s = sum(A)
>>     return false, s
>> end
>>
>> function parfunction(n::Int, A::SharedArray{Uint, 2})
>>     s = sum(A)
>>     return false, s
>> end
>>
>> If I then start julia with e.g. two worker processes, so julia -p 2, the
>> following happens:
>>
>> julia> require("testpar.jl")
>>
>> julia> @time mainfunction(1000, shared = false)
>> elapsed time: 15.717117365 seconds (8448701068 bytes allocated)
>>
>> julia> @time mainfunction(1000, shared = true)
>> elapsed time: 6.068758627 seconds (56713996 bytes allocated)
>>
>> julia> rmprocs([2, 3])
>> :ok
>>
>> julia> @time mainfunction(1000, shared = false)
>> elapsed time: 0.717638344 seconds (40357664 bytes allocated)
>>
>> julia> @time mainfunction(1000, shared = true)
>> elapsed time: 0.702174085 seconds (32680628 bytes allocated)
>>
>> So, with a normal array it's slow, as expected, and it is faster with the
>> shared array. However, with the normal array CPU usage is 100 % on two
>> cores, whereas with the shared array CPU usage spikes for a fraction of a
>> second and then stays at around 10 % for the remaining nearly 6 seconds.
>> Can anyone reproduce this? Am I just doing something wrong with shared
>> arrays?
>>
>> Slightly related note: is there now a way to create a random shared
>> array? https://github.com/JuliaLang/julia/pull/4939 and the latest docs
>> don't mention this.
>>
>
>
