Hi John --
> Just to make sure: the language spec says that the iterator and loop body
> are executed in an interleaved manner, so does that mean the these() method
> (leader iterator) in BlockDom returns only after the loop bodies are
> actually executed? I.e., timing the body of the coforall loop in these() will
> measure how long the forall loop takes to execute on each locale.
The language spec has long been behind w.r.t. the implementation of forall
loops, so don't hesitate to send such questions to the mailing list here.
In the current copy of master, forall loops get translated into either:
a) standalone parallel iterators (a new feature in 1.11), or
b) leader-follower iterators
Case (a) is used when a standalone parallel iterator is available and the
forall loop is not a zippered iteration. Case (b) is used for zippered
forall loops, or in the case that a standalone parallel iterator is not
available. We're currently in the process of implementing standalone
iterators for most of our domain maps (that's how new they are).
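To make the shapes concrete, here's a minimal sketch of the three iterator
forms (serial, standalone, and leader/follower) for a toy iterator; the
iterator name, the chunking scheme, and the assumption that numChunks evenly
divides n are just mine for illustration -- the real iterators in our domain
maps are considerably more involved:

config const n = 16,
             numChunks = 4;

iter myIter() {                              // serial version
  for i in 1..n do yield i;
}

iter myIter(param tag: iterKind)             // standalone parallel version
       where tag == iterKind.standalone {
  const perChunk = n / numChunks;
  coforall c in 0..#numChunks do
    for i in c*perChunk+1..(c+1)*perChunk do
      yield i;
}

iter myIter(param tag: iterKind)             // leader: creates tasks and
       where tag == iterKind.leader {        //   yields 0-based work chunks
  const perChunk = n / numChunks;
  coforall c in 0..#numChunks do
    yield (c*perChunk..#perChunk,);
}

iter myIter(param tag: iterKind, followThis) // follower: yields the indices
       where tag == iterKind.follower {      //   within a leader's chunk
  for i in followThis(1) do                  // (1-based tuple indexing)
    yield i+1;                               // shift back into 1..n
}

forall i in myIter() do                      // non-zippered: standalone
  write(i, " ");                             //   (output order is not
writeln();                                   //    deterministic)

forall (i, j) in zip(myIter(), myIter()) do  // zippered: leader/follower
  write((i, j), " ");
writeln();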
In either case, for typical loop structures, you should think of the forall
loop as being rewritten into something like:
coforall loc in targetLocales {   // create per-locale task
  on loc do {                     // move task to locale
    coforall tid in ... {         // create local tasks
      for i in ... {              // do task-local work
        ...body of forall loop...
      }
    }
  }
}
where the coforall and on statements typically come from the leader/standalone
iterator, the inner for loop comes from the follower/standalone iterator, and
the body is the forall loop's body.
The semantics of a coforall guarantee that the task that entered it
won't complete until all of its iterations have completed; thus, it's
correct that the leader/standalone iterator will not return until after
all the loop bodies have executed (and this property is required in order
to implement forall semantics properly).
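Here's a minimal sketch of that guarantee in isolation, assuming the Timer
and sleep() routines from the standard Time module (the task counts and
sleep durations are just for illustration):

use Time;

config const numTasks = 4;

var t: Timer;
t.start();
coforall tid in 1..numTasks do
  sleep(tid: uint);          // task 'tid' sleeps for tid seconds
t.stop();

// the coforall cannot complete before its slowest task does, so this
// should report roughly numTasks seconds
writeln("elapsed: ", t.elapsed(), " seconds");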
> In one of the previous posts Brad described a cut distribution that
> existed in ZPL. Do you know of any papers about that, or papers about
> other efficient ways of writing variable-sized distributions? My
> implementation isn't quite as efficient as I'd like, though I'll see
> what kind of improvement some caching will bring...
I don't know that there was anything particularly efficient about the cut
distribution in ZPL that you're missing here. Chapel, in its current
form, is known to result in suboptimal performance in many cases,
particularly for distributed-memory runs (see $CHPL_HOME/PERFORMANCE). For
what you're undertaking, I think the big question would be whether your
distribution is significantly underperforming Block when you use it to
distribute things evenly. If so, that suggests that there's more that
could be done to optimize it; if not, it suggests you're running into the
Chapel status quo.
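If it helps, the kind of head-to-head comparison I have in mind is just
timing the same forall over a Block-distributed array and over an array
using your distribution configured to split things evenly; a rough sketch,
where MyDist is purely a stand-in for your distribution's name:

use BlockDist, Time;

config const n = 1000000;

const D = {1..n};
const BlockD = D dmapped Block(boundingBox=D);
// const MyD = D dmapped MyDist(...);   // stand-in for your distribution

var A: [BlockD] real;

var t: Timer;
t.start();
forall a in A do             // same loop you'd time over a [MyD] real array
  a += 1.0;
t.stop();
writeln("Block: ", t.elapsed(), " seconds");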
In particular, ZPL was much more competitive with MPI for many interesting
benchmarks (like the NPB) than Chapel has ever been; but that
competitiveness came with a lot of restrictions -- no user-defined
distributions, no task-parallelism or nested parallelism, no OOP, few
modern language conveniences. You can think of Chapel as being on a path
to reproduce the performance successes of ZPL in a much more
general/extensible language so that it can be practically adopted rather
than remaining an academic curiosity. Each of the last several releases has
contained significant performance improvements, and we expect that
trend to continue.
The main paper about cut distributions in ZPL that I'm aware of was Steve
Deitz's thesis, available here:
http://research.cs.washington.edu/zpl/papers/data/Deitz05Thesis.pdf
though from memory, I'm skeptical that it will contain any silver bullets
that would help with optimizing a similar distribution in Chapel.
-Brad