> On Tue, Oct 11, 2011 at 10:55 AM, j-g-faustus <johannes.fries...@gmail.com> 
> wrote:
>> I expect it would be possible to nest it (possible as in "no exceptions or 
>> deadlocks"), but I can't see any scenario where you would want to - you 
>> would get an exponentially increasing number of threads. If 48 cores each 
>> start 48 threads, each of those threads start another 48 etc., it doesn't 
>> take long before you have enough threads to bog down even the most powerful 
>> server.
>> 
>> But what would be the purpose of a nested "run this on all cores" construct? 
>> You are already running on all cores, there are no spare resources, so in 
>> terms of CPU time I can't see how it would differ from merely having the 
>> pmapped function use a plain same-thread map? 

On Oct 11, 2011, at 2:07 PM, Andy Fingerhut wrote:
> One benefit would be convenience of enabling parallelism on nested data 
> structures.  One function at the top level could use parallelism, and the 
> pieces, perhaps handled by separate functions, and perhaps nested several 
> levels deep in function calls, could also use parallelism.


Or consider the following scenario (which is exactly what I was doing :-): You 
want to produce the next generation of programs in an evolutionary computation 
system from n parents, each of which may produce m offspring, with n much 
larger than your number of cores (c). The runtimes for offspring production may 
vary widely. So you do something like (pmapall reproduce population) to 
maximize processor utilization among the parents, but late in the process, when 
the number of parents who are still busy making offspring is less than c, your 
cores start to go idle. Suppose you have just a few remaining parents who are 
much slower than the others, and each has to chug through m independent birth 
processes with cores sitting idle. If you could nest calls to pmapall then 
those slowpokes would be farming some of their m births out to the available 
cores.

In this situation, at least, I don't care about the order in which the (* n m) 
independent tasks are completed. I just want them all done, with as little 
wasted CPU capacity as possible. And I don't want to somehow factor out all of 
the births into one long queue that I pass to a single call to pmapall. The 
natural decomposition of the problem is to say something like (pmapall 
reproduce population) at the population level, something like (pmapall (fn [_] 
(make-baby parent)) (range litter-size)) at the parent level, and ask the 
platform to make the calls as aggressively as possible, reusing cores for 
pending calls whenever possible.
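
To make that decomposition concrete, here is a minimal sketch. It uses the built-in pmap as a stand-in for pmapall (which is not shown in this message), and make-baby, reproduce, next-generation and the map-shaped "offspring" are hypothetical placeholders, not the real system. It only shows the nesting: one parallel map over the population, and inside each parent another parallel map over its litter.

```clojure
;; Hypothetical stand-ins for illustration only; pmap substitutes for pmapall.

(defn make-baby
  "Pretend birth process: returns a placeholder child for `parent`."
  [parent i]
  {:parent parent :index i})

(defn reproduce
  "Produce `litter-size` offspring for one parent via an inner parallel map."
  [litter-size parent]
  ;; doall forces the inner pmap so all births finish before we return.
  (doall (pmap (fn [i] (make-baby parent i)) (range litter-size))))

(defn next-generation
  "Outer parallel map over the population; returns all offspring in one vector."
  [population litter-size]
  (vec (apply concat (pmap (partial reproduce litter-size) population))))

;; Example call: three parents, two offspring each.
(next-generation [:a :b :c] 2)
```

Because pmap preserves input order, the result here is ordered by parent and then by litter index, even though I don't actually need that ordering. If this is run as a script, (shutdown-agents) at the end lets the JVM exit promptly, since pmap runs its work in futures.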

Calls to pmap can be nested and I *think* that they will actually do some of 
this, if the sequencing of fast vs slow things is fortuitous. For example, I 
*think* that in some of the situations that have been described here, where 
some cores are idle waiting for a slow, early computation in a sequence to 
complete, a pmap from within the slow, early computation will use some of those 
available cores. I'm not certain of this, however! And in any event it's not 
ideal for the reasons that have been raised.
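
One rough way to probe this is sketched below. The names birth, slow-parent and inner-threads are made up for the experiment, and Thread/sleep stands in for real work, so it only shows which threads the inner pmap's tasks ran on. It does not show that idle CPU cores were actually put to use.

```clojure
;; Hypothetical experiment; all function names here are illustrative.

(defn birth
  "Simulate one birth by sleeping `ms` milliseconds; return the worker thread's name."
  [ms]
  (Thread/sleep (long ms))
  (.getName (Thread/currentThread)))

(defn slow-parent
  "Run `m` simulated births through an inner pmap and return the thread names."
  [m ms]
  (doall (pmap (fn [_] (birth ms)) (range m))))

(defn inner-threads
  "Outer pmap over one slow parent and three trivial ones; returns the set of
   thread names on which the slow parent's births ran."
  [m ms]
  (let [results (doall (pmap (fn [slow?] (if slow? (slow-parent m ms) []))
                             [true false false false]))]
    (set (first results))))

;; Example call: four simulated births of 20 ms each.
(inner-threads 4 20)
```

If the returned set has more than one name, the inner births were spread over several threads while the outer pmap was still waiting on the slow parent. A CPU-bound body in place of the sleep would be needed to say anything about core utilization.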

 -Lee

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
