I mentioned localize_vars() since it is one of the differences between the
implementations of @everywhere and @spawnat. But there is also something
else going on here that I don't understand.
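
For what it's worth, here is a minimal illustration of that difference (a
sketch only; I haven't run it in exactly this form):

localx = rand(3)

# @spawnat goes through localize_vars, so localx is captured and shipped
# along with the closure
r = @spawnat 2 sum(localx)
fetch(r)                       # works

# @everywhere just evaluates the expression on every process; localx is
# looked up in each worker's global scope, where it is not defined
@everywhere y = sum(localx)    # ERROR on the workers: localx not defined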

On Sun, Nov 23, 2014 at 12:13 PM, Madeleine Udell <[email protected]
> wrote:

> Yes, I read the code, but I'm not sure I understand what the let statement
> is doing. Is it redefining the scope of the variable, or creating a new
> variable with the same value but in a different scope? How does the let
> statement interact with the namespaces of the various processes?
>
> On Sat, Nov 22, 2014 at 10:30 PM, Amit Murthy <[email protected]>
> wrote:
>
>> From the description of Base.localize_vars - 'wrap an expression in "let
>> a=a,b=b,..." for each var it references'
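>>
>> Roughly speaking, it rewrites something like
>>
>> @spawnat p f(x, y)
>>
>> into
>>
>> let x = x, y = y
>>     @spawnat p f(x, y)
>> end
>>
>> so the closure that gets shipped captures fresh local bindings holding
>> the current values of x and y, instead of referring to the enclosing
>> variables directly. That is my reading of it, anyway.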
>>
>> Though that does not seem to be the only(?) issue here....
>>
>> On Sun, Nov 23, 2014 at 11:52 AM, Madeleine Udell <
>> [email protected]> wrote:
>>
>>> Thanks! This is extremely helpful.
>>>
>>> Can you tell me more about what localize_vars does?
>>>
>>> On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy <[email protected]>
>>> wrote:
>>>
>>>> This works:
>>>>
>>>> function doparallelstuff(m = 10, n = 20)
>>>>     # initialize variables
>>>>     localX = Base.shmem_rand(m; pids=procs())
>>>>     localY = Base.shmem_rand(n; pids=procs())
>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>
>>>>     # broadcast variables to all worker processes
>>>>     @sync begin
>>>>         for i in procs(localX)
>>>>             remotecall(i, x->(global X; X=x; nothing), localX)
>>>>             remotecall(i, x->(global Y; Y=x; nothing), localY)
>>>>             remotecall(i, x->(global f; f=x; nothing), localf)
>>>>             remotecall(i, x->(global g; g=x; nothing), localg)
>>>>         end
>>>>     end
>>>>
>>>>     # compute
>>>>     for iteration=1:1
>>>>         @everywhere for i=localindexes(X)
>>>>             X[i] = f[i](Y)
>>>>         end
>>>>         @everywhere for j=localindexes(Y)
>>>>             Y[j] = g[j](X)
>>>>         end
>>>>     end
>>>> end
>>>>
>>>> doparallelstuff()
>>>>
>>>> Though I would have expected broadcasting the variables to be possible
>>>> with just
>>>> @everywhere X = localX
>>>> and so on ....
>>>>
>>>>
>>>> It looks like @everywhere does not call localize_vars. I don't know if
>>>> this is by design or just an oversight; I would have expected it to do
>>>> so. I will file an issue on GitHub.
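>>>>
>>>> In the meantime, a small helper along these lines (just a sketch,
>>>> untested; the name sendto is made up) avoids repeating the remotecall
>>>> boilerplate for every variable:
>>>>
>>>> # assign val to the global `name` in Main on each process in pids
>>>> function sendto(name::Symbol, val; pids=procs())
>>>>     @sync for p in pids
>>>>         @async remotecall_wait(p,
>>>>             (n, v) -> (eval(Main, Expr(:(=), n, v)); nothing), name, val)
>>>>     end
>>>> end
>>>>
>>>> sendto(:X, localX); sendto(:Y, localY)
>>>> sendto(:f, localf); sendto(:g, localg)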
>>>>
>>>>
>>>>
>>>> On Sun, Nov 23, 2014 at 8:24 AM, Madeleine Udell <
>>>> [email protected]> wrote:
>>>>
>>>>> The code block I posted before works, but it throws an error when
>>>>> embedded in a function: "ERROR: X not defined" (in the first line of
>>>>> the @parallel loop). Why am I getting this error when I'm *assigning
>>>>> to* X?
>>>>>
>>>>> function doparallelstuff(m = 10, n = 20)
>>>>>     # initialize variables
>>>>>     localX = Base.shmem_rand(m)
>>>>>     localY = Base.shmem_rand(n)
>>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>>
>>>>>     # broadcast variables to all worker processes
>>>>>     @parallel for i=workers()
>>>>>         global X = localX
>>>>>         global Y = localY
>>>>>         global f = localf
>>>>>         global g = localg
>>>>>     end
>>>>>     # give variables same name on master
>>>>>     X,Y,f,g = localX,localY,localf,localg
>>>>>
>>>>>     # compute
>>>>>     for iteration=1:1
>>>>>         @everywhere for i=localindexes(X)
>>>>>             X[i] = f[i](Y)
>>>>>         end
>>>>>         @everywhere for j=localindexes(Y)
>>>>>             Y[j] = g[j](X)
>>>>>         end
>>>>>     end
>>>>> end
>>>>>
>>>>> doparallelstuff()
>>>>>
>>>>> On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> My experiments with parallelism also occur in focused blocks; I think
>>>>>> that's a sign that it's not yet as user friendly as it could be.
>>>>>>
>>>>>> Here's a solution to the problem I posed that's simple to use:
>>>>>> @parallel + global can be used to broadcast a variable, while
>>>>>> @everywhere can be used to do a computation on local data (i.e.,
>>>>>> without resending the data). I'm not sure how to do the variable
>>>>>> renaming programmatically, though (a tentative idea is sketched after
>>>>>> the code below).
>>>>>>
>>>>>> # initialize variables
>>>>>> m,n = 10,20
>>>>>> localX = Base.shmem_rand(m)
>>>>>> localY = Base.shmem_rand(n)
>>>>>> localf = [x->i+sum(x) for i=1:m]
>>>>>> localg = [x->i+sum(x) for i=1:n]
>>>>>>
>>>>>> # broadcast variables to all worker processes
>>>>>> @parallel for i=workers()
>>>>>>     global X = localX
>>>>>>     global Y = localY
>>>>>>     global f = localf
>>>>>>     global g = localg
>>>>>> end
>>>>>> # give variables same name on master
>>>>>> X,Y,f,g = localX,localY,localf,localg
>>>>>>
>>>>>> # compute
>>>>>> for iteration=1:10
>>>>>>     @everywhere for i=localindexes(X)
>>>>>>         X[i] = f[i](Y)
>>>>>>     end
>>>>>>     @everywhere for j=localindexes(Y)
>>>>>>         Y[j] = g[j](X)
>>>>>>     end
>>>>>> end
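>>>>>>
>>>>>> For the renaming on the master, a tentative idea (an untested sketch)
>>>>>> is to loop over name/value pairs and use @eval at top level:
>>>>>>
>>>>>> for (name, val) in [(:X, localX), (:Y, localY),
>>>>>>                     (:f, localf), (:g, localg)]
>>>>>>     @eval $name = $val    # creates/overwrites the global binding
>>>>>> end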
>>>>>>
>>>>>> On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> My experiments with parallelism tend to occur in focused blocks, and
>>>>>>> I haven't done it in quite a while. So I doubt I can help much. But
>>>>>>> in general I suspect you're encountering these problems because much
>>>>>>> of the IPC goes through thunks, and so a lot of stuff gets reclaimed
>>>>>>> when execution is done.
>>>>>>>
>>>>>>> If I were experimenting, I'd start by trying to create RemoteRef()s
>>>>>>> and put!()ing my variables into them. Then perhaps you might be able
>>>>>>> to fetch them from other processes. Not sure that will work, but it
>>>>>>> seems to be worth a try.
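>>>>>>>
>>>>>>> Something along these lines, maybe (untested):
>>>>>>>
>>>>>>> x = rand(10)
>>>>>>> rr = RemoteRef(2)      # reference whose value lives on process 2
>>>>>>> put!(rr, x)
>>>>>>> remotecall_fetch(2, r -> sum(fetch(r)), rr)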
>>>>>>>
>>>>>>> HTH,
>>>>>>> --Tim
>>>>>>>
>>>>>>> On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
>>>>>>> > I'm trying to use parallelism in Julia for a task with a structure
>>>>>>> > that I think is quite pervasive. It looks like this:
>>>>>>> >
>>>>>>> > # broadcast lists of functions f and g to all processes so they're
>>>>>>> > # available everywhere
>>>>>>> > # create shared arrays X, Y on all processes so they're available
>>>>>>> > # everywhere
>>>>>>> > for iteration = 1:1000
>>>>>>> >     @parallel for i = 1:length(X)
>>>>>>> >         X[i] = f[i](Y)
>>>>>>> >     end
>>>>>>> >     @parallel for j = 1:length(Y)
>>>>>>> >         Y[j] = g[j](X)
>>>>>>> >     end
>>>>>>> > end
>>>>>>> >
>>>>>>> > I'm having trouble making this work, and I'm not sure where to dig
>>>>>>> > around to find a solution. Here are the difficulties I've encountered:
>>>>>>> >
>>>>>>> > * @parallel doesn't allow me to create persistent variables on each
>>>>>>> > process; i.e., the following results in an error.
>>>>>>> >
>>>>>>> > s = Base.shmem_rand(12, 3)
>>>>>>> > @parallel for i = 1:nprocs() m, n = size(s) end
>>>>>>> > @parallel for i = 1:nprocs() println(m) end
>>>>>>> >
>>>>>>> > * @everywhere does allow me to create persistent variables on each
>>>>>>> > process, but doesn't send any data at all, including the variables
>>>>>>> > I need in order to define new variables. E.g., the following is an
>>>>>>> > error: s is a shared array, but the variable (i.e., the reference
>>>>>>> > to) s is apparently not shared.
>>>>>>> >
>>>>>>> > s = Base.shmem_rand(12, 3)
>>>>>>> > @everywhere m, n = size(s)
>>>>>>> >
>>>>>>> > Here are the kinds of questions I'd like to see prototype code for:
>>>>>>> > * How can I broadcast a variable so that it is available and
>>>>>>> > persistent on every process?
>>>>>>> > * How can I create a reference to the same shared array "s" that is
>>>>>>> > accessible from every process?
>>>>>>> > * How can I send a command to be performed in parallel, specifying
>>>>>>> > which variables should be sent to the relevant processes and which
>>>>>>> > should be looked up in the local namespace?
>>>>>>> >
>>>>>>> > Note that nothing I ask above is specific to shared arrays; the
>>>>>>> > same constructs would also be extremely useful in the distributed
>>>>>>> > case.
>>>>>>> >
>>>>>>> > ----------------------
>>>>>>> >
>>>>>>> > An interesting partial solution is the following:
>>>>>>> >
>>>>>>> > funcs! = Function[x -> x[:] = x + k for k = 1:3]
>>>>>>> > d = drand(3, 12)
>>>>>>> > let funcs! = funcs!
>>>>>>> >     @sync @parallel for k in 1:3
>>>>>>> >         funcs![myid()-1](localpart(d))
>>>>>>> >     end
>>>>>>> > end
>>>>>>> >
>>>>>>> > Here, I'm not sure why the let statement is necessary to send
>>>>>>> > funcs!, since d is sent automatically.
>>>>>>> >
>>>>>>> > ---------------------
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> > Madeleine
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
> --
> Madeleine Udell
> PhD Candidate in Computational and Mathematical Engineering
> Stanford University
> www.stanford.edu/~udell
>
