Also would you mind giving a (very) brief explanation about the caching
system used? Or point me to a relevant paper.
I meant the one that's present in BlockDist.chpl file, called RAD cache.
Or is there more than one in there? Information related to stencils
and ghost cells would be especially appreciated.
BlockDist itself does not currently support stencil-style caching.
There's a start at how this might be added in
[test/release/]examples/benchmarks/miniMD/helpers/StencilDist.chpl, which
is itself a clone of BlockDist, extended to support halos/ghost cells
("fluff"). The main downside of using StencilDist at present is that the
user is responsible for explicitly requesting the updates to the caches
(rather than having the compiler automatically insert those calls in order
to support the "plug-and-play" vision of domain maps with no code
changes). The miniMD code in that directory (and its parent) is the best
illustration of this feature at present.
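
To make the "explicit update" point concrete, here is a minimal sketch in
Python (not Chapel, and not StencilDist's actual interface). The names
Block, update_fluff, and stencil_sum are hypothetical, and the "locales"
are just objects in one process. Each block keeps one ghost cell on each
side, and the stencil only sees its neighbors' values after the user calls
the update.

```python
class Block:
    """One locale's chunk of a 1D array, with one ghost cell on each side."""
    def __init__(self, values):
        # data[0] and data[-1] are ghost cells ("fluff"); the rest is owned.
        self.data = [0.0] + list(values) + [0.0]


def update_fluff(blocks):
    """Explicitly refresh every ghost cell from the neighboring block's edge."""
    for k, block in enumerate(blocks):
        if k > 0:
            block.data[0] = blocks[k - 1].data[-2]   # left neighbor's last owned element
        if k < len(blocks) - 1:
            block.data[-1] = blocks[k + 1].data[1]   # right neighbor's first owned element


def stencil_sum(block):
    """3-point stencil over the owned elements, reading only local data."""
    d = block.data
    return [d[i - 1] + d[i] + d[i + 1] for i in range(1, len(d) - 1)]


blocks = [Block([1.0, 2.0]), Block([3.0, 4.0])]
stale = stencil_sum(blocks[0])   # ghost cells have not been filled yet
update_fluff(blocks)             # the call the user must remember to make
fresh = stencil_sum(blocks[0])   # now sees the neighbor's boundary value
```

Forgetting the update_fluff call gives a silently wrong result at the
block boundary rather than an error, which is why having the compiler
insert the call would be preferable.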
The RAD cache is a different type of cache. It's designed to cache the
meta-data in the array class itself (essentially "the dope vector"), which
permits one locale to index into another's memory in a single message.
I.e., if you own index (i,j) of a distributed array, I can either:
(a) remotely access all the meta-data from your descriptor that I need to
do the indexing calculation to determine the element's address myself
(requires lots of gets);
(b) ship you (i,j), have you do all the indexing and return a reference
to the array element (requires an active message);
(c) cache all the meta-data from your descriptor such that, once I have
it, I can do the indexing locally (requires gets, but only to populate
the cache; after that, remote accesses should incur no further
meta-data communication).
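
The three options can be compared with a toy message-counting model in
Python. This is a sketch under stated assumptions, not how the Chapel
runtime implements it: Locale, Remote, and the read_a/read_b/read_c
methods are hypothetical names, the descriptor is reduced to two fields
(low bounds and shape) of a row-major 2D block, and option (b) returns the
element's value rather than a reference.

```python
class Locale:
    """Hypothetical owner of one row-major 2D block of a distributed array."""
    def __init__(self, low, shape, values):
        self.meta = {"low": low, "shape": shape}   # stand-in for the descriptor
        self.mem = list(values)                    # the block's element storage


class Remote:
    """View of another locale's block that counts the messages it sends."""
    def __init__(self, owner):
        self.owner = owner
        self.gets = 0
        self.active_messages = 0
        self.cache = None

    def _get_meta(self, field):
        self.gets += 1                  # one get per descriptor field
        return self.owner.meta[field]

    def _get_elem(self, offset):
        self.gets += 1                  # one get for the element itself
        return self.owner.mem[offset]

    @staticmethod
    def _offset(low, shape, i, j):
        return (i - low[0]) * shape[1] + (j - low[1])

    def read_a(self, i, j):
        """(a) Fetch the meta-data remotely on every access."""
        low = self._get_meta("low")
        shape = self._get_meta("shape")
        return self._get_elem(self._offset(low, shape, i, j))

    def read_b(self, i, j):
        """(b) Ship (i, j) to the owner, which indexes on our behalf."""
        self.active_messages += 1
        m = self.owner.meta
        return self.owner.mem[self._offset(m["low"], m["shape"], i, j)]

    def read_c(self, i, j):
        """(c) Populate a local meta-data cache once, then index locally."""
        if self.cache is None:
            self.cache = (self._get_meta("low"), self._get_meta("shape"))
        low, shape = self.cache
        return self._get_elem(self._offset(low, shape, i, j))


owner = Locale(low=(10, 20), shape=(2, 3), values=range(6))
indices = [(10, 20), (10, 22), (11, 21)]

a, b, c = Remote(owner), Remote(owner), Remote(owner)
vals_a = [a.read_a(i, j) for i, j in indices]
vals_b = [b.read_b(i, j) for i, j in indices]
vals_c = [c.read_c(i, j) for i, j in indices]
```

In this model, three accesses cost nine gets under (a), three active
messages under (b), and five gets under (c): two to fill the cache, then
one per element. A real descriptor has more fields than two, so the gap
between (a) and (c) would be larger in practice.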
My high-level understanding is that enabling this optimization for a given
distribution requires identifying which parts of the array descriptor need
to be cached to do this type of remote access. For the changes to Block
that we've discussed, I would guess no changes would be required. (I also
typically suggest that people not worry about this optimization until they
get things up and running.)
The best description of this that I'm aware of is in the commit messages
that added the capability. For example:
https://github.com/bradcray/chapel/commit/633353f
https://github.com/bradcray/chapel/commit/fa9b841
though there may be other documentation. I can ask around if desired and
nobody else speaks up on this thread.
-Brad