Yes, you're at least half right that it doesn't do quite what I want.
Or rather: I was expecting the majority of the overhead to come from
having to send the array over to each process, but I wasn't expecting
that getting a boolean and an integer back would take so much time (and
thus I expected using a SharedArray to be at least comparable to
keeping everything local). Indeed, if I just do a remotecall (i.e.
without the fetch), it is faster with multiple processes, which is what
I was expecting.
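(Concretely, the difference I mean, using the names from my code below:

    # Fire-and-forget: returns a RemoteRef immediately, nothing shipped back
    remotecall(p, parfunction, n, A)

    # Blocking: waits for the worker and ships the result back over the wire
    found, s = remotecall_fetch(p, parfunction, n, A)

It's the second form that gets slow for me.)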
What I essentially want in the end is for parfunction() to succeed
with some probability and, on success, return some object from the
calculations there; in general I will not want to fetch anything. What
would be the "correct" way to do that? If I have the following code:
function mainfunction(r)
    const n = 1000
    A = SharedArray(Uint, (n, n))
    for i = 1:n, j = 1:n
        A[i, j] = rand(Uint)
    end
    s = SharedArray(Uint, (1,))
    i = 1
    nextidx() = (idx = i; i += 1; idx)
    println(s)
    @sync begin
        for p in workers()
            @async begin
                while true
                    idx = nextidx()
                    if idx > r
                        break
                    end
                    remotecall(p, parfunction, A, s)
                end
            end
        end
    end
    println(s)
end

function parfunction(A::SharedArray{Uint, 2}, s::SharedArray{Uint, 1})
    d = sum(A)
    if rand(0:1000) == 0
        println("success")
        s[1] = d
    end
end
and run
julia -p 2
julia> reload("testpar.jl")
julia> @time mainfunction(5000)
I get ERROR: SharedArray cannot be used on a non-participating process,
although by my logic s should be available on all processes. (I'm
assuming it's s that's causing it, because everything is fine if I
remove all traces of s.)
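One guess, which I haven't verified: maybe I need to pass the
participating pids explicitly when constructing s? Something like:

    # Untested guess: make every process (including 1) a participant
    s = SharedArray(Uint, (1,), pids=procs())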
On Thu, Mar 27, 2014, at 4:54, Amit Murthy wrote:
I think the code does not do what you want.
In the non-shared case you are sending a 10^6-element integer array
over the network 1000 times and summing it as many times. Most of the
time is network traffic time. Reduce 'n' to, say, 10, and you will see
what I mean.
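A quick way to see the transfer cost by itself (just a sketch; it
assumes a worker with id 2 exists):

    A = Array(Uint, (1000, 1000))           # ~8 MB of payload
    for i = 1:length(A); A[i] = rand(Uint); end
    @time remotecall_fetch(2, sum, A)       # ships the whole array, then sums it
    @time remotecall_fetch(2, () -> 0)      # empty round trip, for comparison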
In the shared case you are not sending the array over the network, but
you are still summing the entire array 1000 times. Some of the
remotecall_fetch calls seem to take an extra 40 milliseconds each,
which adds to the total.
The shared time of 6 seconds being less than the 15 seconds for
non-shared seems to be incidental.
I don't yet have an explanation for the extra 40 milliseconds per
remotecall_fetch (for some calls only) in the shared case.
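To narrow down where the time goes, one could time the pieces
separately (again just a sketch, assuming a worker with id 2 and
constructing with explicit participants):

    S = SharedArray(Uint, (1000, 1000), pids=procs())
    @time sum(S)                       # pure compute time, on this process
    @time remotecall_fetch(2, sum, S)  # compute plus round trip; the shared
                                       # array maps memory instead of copying
                                       # it, so a large gap here would point
                                       # at messaging overhead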
On Thu, Mar 27, 2014 at 2:50 PM, Mikael Simberg
<[email protected]> wrote:
Hi,
I'm having some trouble figuring out exactly how I'm supposed to use
SharedArrays - I might just be misunderstanding them, or something odd
is happening with them.
I'm trying to do some parallel computing which looks a bit like this
test case:
function createdata(shared)
    const n = 1000
    if shared
        A = SharedArray(Uint, (n, n))
    else
        A = Array(Uint, (n, n))
    end
    for i = 1:n, j = 1:n
        A[i, j] = rand(Uint)
    end
    return n, A
end
function mainfunction(r; shared = false)
    n, A = createdata(shared)
    i = 1
    nextidx() = (idx = i; i += 1; idx)
    @sync begin
        for p in workers()
            @async begin
                while true
                    idx = nextidx()
                    if idx > r
                        break
                    end
                    found, s = remotecall_fetch(p, parfunction, n, A)
                end
            end
        end
    end
end
function parfunction(n::Int, A::Array{Uint, 2})
    # possibly do some other computation here independent of shared arrays
    s = sum(A)
    return false, s
end

function parfunction(n::Int, A::SharedArray{Uint, 2})
    s = sum(A)
    return false, s
end
If I then start julia with e.g. two worker processes, so julia -p 2,
the following happens:
julia> require("testpar.jl")
julia> @time mainfunction(1000, shared = false)
elapsed time: 15.717117365 seconds (8448701068 bytes allocated)
julia> @time mainfunction(1000, shared = true)
elapsed time: 6.068758627 seconds (56713996 bytes allocated)
julia> rmprocs([2, 3])
:ok
julia> @time mainfunction(1000, shared = false)
elapsed time: 0.717638344 seconds (40357664 bytes allocated)
julia> @time mainfunction(1000, shared = true)
elapsed time: 0.702174085 seconds (32680628 bytes allocated)
So with a normal array it's slow, as expected, and it is faster with
the shared array. But what seems to happen is that with the normal
array CPU usage is 100% on two cores, while with the shared array CPU
usage spikes for a fraction of a second and then sits at around 10% for
the remaining nearly 6 seconds. Can anyone reproduce this? Am I just
doing something wrong with shared arrays?
On a slightly related note: is there now a way to create a random
shared array? https://github.com/JuliaLang/julia/pull/4939 and the
latest docs don't mention this.
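From skimming the PR, my guess would be something with the init keyword
and localindexes, roughly like below, but I don't know if that's the
intended way:

    # Guess based on the PR, not verified: each participating process
    # fills its own chunk of the array in parallel
    A = SharedArray(Uint, (n, n), init = S -> begin
            for i in localindexes(S)
                S[i] = rand(Uint)
            end
        end)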