Hi,
I'm struggling to understand the scheduling subsection (in the parallel
computation section) of the docs, specifically how the pmap function
shown there works. I've pasted the code I used below my questions.
Question 1) Is the purpose of the @sync block just to wait for all the
processes to finish before returning the variable results i.e. so that no
incomplete references are returned?
Question 2) If you run the code below (I added 2 other processes) you'll
see the following output:
In @async, p = 2, idx = 1
In @async, p = 3, idx = 2
In @async, p = 3, idx = 3
In @async, p = 2, idx = 4
In @async, p = 3, idx = 5
How does p go from 2 to 3 and back to 2, etc.? I thought a for loop runs
through its iterations sequentially. Clearly @async does something exotic,
but I don't understand what.
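To check my understanding, here is a stripped-down, single-process sketch of what I think might be going on. It is my own guess, not something from the docs: interleave_demo is a name I made up, the p values 2 and 3 are just labels (no worker processes are involved), and sleep stands in for remotecall_fetch on the assumption that both block the current task and let another task run in the meantime.

```julia
# Hypothetical sketch: two @async tasks pull indices from one shared counter.
function interleave_demo()
    n = 5
    i = 1
    # same "next work item" helper as in the docs example
    nextidx() = (idx = i; i += 1; idx)
    order = Any[]                      # records (p, idx) in the order claimed
    @sync begin
        for p in 2:3                   # labels only, not real process ids
            @async begin
                while true
                    idx = nextidx()
                    idx > n && break
                    push!(order, (p, idx))
                    sleep(0.01 * p)    # assumed stand-in for remotecall_fetch
                end
            end
        end
    end
    order
end
```

If that assumption holds, the for loop itself still runs sequentially and only creates the tasks; each task then claims a new idx whenever its previous call returns, which would explain why the p values in my output above are mixed.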
Any help appreciated!
The code:
M = {rand(800,800), rand(800,800), rand(600,600), rand(600,600), rand(500,500)}
function pmap2(f, lst)
    np = nprocs() # determine the number of processes available
    n = length(lst)
    results = cell(n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    nextidx() = (idx=i; i+=1; idx)
    @sync begin
        for p=1:np
            if p != myid() || np == 1
                @async begin
                    while true
                        idx = nextidx()
                        if idx > n
                            break
                        end
                        println("In @async, p = ",p,", idx = ", idx)
                        results[idx] = remotecall_fetch(p, f, lst[idx])
                    end
                end
            end
        end
    end
    results
end
a = pmap2(svd, M)
1+1 # this is just so that a is not printed by default...