The manual suggests that pmap(f, lst) dynamically "feeds" elements of 
lst to the function f as each worker completes its previous assignment, and 
from my reading of the pmap source, this is indeed what it does.

However, I have found that, in practice, many of the workers I spin up 
for pmap tasks sit idle for roughly the last half of the total time needed 
to complete the job.  In my usage, the cost of the work varies a lot across 
elements of lst: some elements take a long time to compute (say, 15 minutes 
on one core of my machine) and others a short time (less than 1 minute).  
Given this heterogeneity, seeing workers go idle once about half of the 
work is done would normally lead me to think that pmap assigns work to 
workers ahead of time rather than dynamically.  Under static assignment, 
some workers get "lucky" and draw an easier-than-average workload, while 
others are unlucky and draw a harder one, so at the end of the calculation 
only the unlucky workers are still busy.  But that isn't what pmap is 
supposed to be doing, so I'm kinda confused.
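
To make the static-versus-dynamic distinction concrete, here is a toy 
simulation in Python.  It is only a sketch of the two scheduling ideas, not 
Julia and not pmap's actual implementation; the function names, the 
four-worker setup, and the task durations (in minutes) are all made up for 
illustration.

```python
import heapq


def static_schedule(durations, n_workers):
    """Pre-assign tasks round-robin; return each worker's finish time."""
    finish = [0.0] * n_workers
    for i, d in enumerate(durations):
        finish[i % n_workers] += d
    return finish


def dynamic_schedule(durations, n_workers):
    """Give the next task to whichever worker becomes free first."""
    # Heap of (time the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    for d in durations:
        t, w = heapq.heappop(free_at)
        heapq.heappush(free_at, (t + d, w))
    finish = [0.0] * n_workers
    for t, w in free_at:
        finish[w] = t
    return finish


def idle_fraction(finish):
    """Share of total worker time spent waiting for the slowest worker."""
    makespan = max(finish)
    return 1.0 - sum(finish) / (len(finish) * makespan)


# Hypothetical workload: two 15-minute tasks among 1-minute tasks.
durations = [15, 1, 1, 1, 15, 1, 1, 1]

for name, schedule in [("static", static_schedule),
                       ("dynamic", dynamic_schedule)]:
    finish = schedule(durations, 4)
    print(name, finish, round(idle_fraction(finish), 3))
```

With this made-up ordering, the static split piles both long tasks onto one 
worker and leaves the pool idle about 70% of the time.  The dynamic version 
does better (about 44% idle), but it still ends with a tail of idle workers 
because the second long task starts only after the short ones ahead of it 
have been handed out.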

Am I crazy?  The documentation for pmap says that it schedules tasks 
dynamically, and I am pre-randomizing the order of work in lst so that 
worker 1 doesn't get easier tasks, in expectation, than worker N.  Or is it 
more likely that I've got a bug somewhere?

