Hi Marcin --
I am wondering if there is a way to change a mapping of a distributed
array.
Good question. Feel free to direct questions like this that might be of
general interest to Chapel programmers to Stack Overflow if you use it
(tends to be a more searchable, permanent place to look for answers to
such questions).
High-level answer: This isn't quite what you were asking, but there isn't
a way to change the domain of an array nor the domain map of a domain
(currently at least, nor a plan to support it in the future). These
decisions are effectively part of the static type of the variable and so
would be challenging to change. The rationale for this decision is that
they have a big impact on the code generated for an array.
Getting closer to what you're asking: a given domain map can potentially
be modified to affect the implementation of its doamins and their arrays,
including their distributions -- if the domain map author has provided a
means to do so. Most of the rest of this mail will describe what is
possible within our current domain maps, but note that one could always
modify the domain maps further, or write a new domain map, to support
things that our domain maps don't happen to support currently.
Diving in a bit further, in the example you provide, the issue is that the
Block distribution doesn't (currently) have a way of redefining its
bounding box / distribution as you'd ideally like to:
var domC = {0..0} dmapped Block({0..0});
...
domC = {0..i}; // dmapped Block({0..i}); // needed, or halt on assignment
So once the bounding box defining its distribution is set to {0..0},
there's no way to update it. The reason for this is partially due to
laziness on the part of the domain map author (in this case, probably me),
and partially fear of leading users to do something expensive without
being aware of it. Specifically, if you think about the communication
required to naively change the bounding box from 0..0 to 0..1 to 0..2 to
0..n one element at a time across a large number of locales, there'd be
lots of little ripples of data across the locale set with each change.
(That said, the Block distribution probably really _ought_ to define a
method that redefines the bounding box, regardless of the cost, where the
method's documentation could warn against using it too frequently for
performance reasons).
I understand why is this, but I am looking for a way to add values to a
distributed array and then “rebalance” the distribution at some point.
So there are 2 questions I have:
1. Is there a way to do this with arrays at all without somehow
creating a whole new array?
2. Is there any better approach for this?
Answering both questions at once, my mind goes to using a different
distribution that's more amenable to load balancing as new indices are
added to it. For example, rewriting your example to use the Cyclic
distribution:
use CyclicDist;
var domC = {0..0} dmapped Cyclic(startIdx=0);
the output shows up like this:
Initially, C is: 0
C(0) assigned, C is: 0. domC is {0..0}
C(1) assigned, C is: 0 1. domC is {0..1}
C(2) assigned, C is: 0 1 0. domC is {0..2}
C(3) assigned, C is: 0 1 0 1. domC is {0..3}
C(4) assigned, C is: 0 1 0 1 0. domC is {0..4}
C(5) assigned, C is: 0 1 0 1 0 1. domC is {0..5}
So, by nature, Cyclic is better suited to distributing new indices across
the locales without any need to change the definition or type of the
domain map.
If your data structure / algorithm is one that benefits from having
consecutive blocks of _logical_ indices mapped to the same locale, then
the BlockCyclic distribution might be a nice choice in that it would
combine some of the index locality of the Block distribution with the
ability to grow arbitrarily without redefining the domain map, as in the
Cyclic distribution. But trying it within your program, I'm finding that
Block-Cylic isn't mature enough to support your program yet, for lame
reasons (of these three domain maps, it's the one that's seen the least
amount of development + optimization effort and apparently re-assigning a
domain dmapped with BlockCyclic() has never been implemented). Feel free
to file a feature request against this if it it's of interest to you.
But let me also emphasize that _index locality_, not spatial locality, is
the main difference between Cyclic and Block-Cyclic. Both will store the
local elements of a 1D array in consecutive memory locations, so they
should both have similar spatial locality if all you were doing was
iterating over all of the elements of an array (say).
If index locality doesn't matter to you at all, another option to explore
would be to use a distributed associative domain. This results in even
less spatial locality, and the load balancing would only be as good as its
hash function. Distributed associative domains are a feature that have
never made it onto master, but have been "almost done" forever, so this
would be another place that'd be fair to file a feature request if it'd be
useful/preferable to you than the 1D rectangular domains/arrays you're
currently using.
Circling back around: even if Block() were to support a way to dynamically
change its bounding box, I'm not convinced that it's what you'd want to
use for your case given that you're potentially adding vertices
incrementally, which isn't a cheap operation for a Block distribution.
Note that even if we implemented a way to change the bounding box for
Block(), we'd likely create a whole new array under the covers, assign
between the arrays and throw the original away -- so you'd get the
convenience of being able to change the bounding box, but wouldn't get
anything like in-place reallocation and minimal shifting of data between
locales without a lot of work (i.e., one _could_ implement this with
effort, I'm just not convinced the benefit would be worth the effort, so
wouldn't sign up for it myself).
With arrays with Block distribution, everything outside of the bounding
box seems to go to the last locale.
This isn't quite right. It's actually that the n-dimensional bounding box
specifies a partitioning on the n-dimensional space and that any indices
that fall outside the bounding box are mapped to the locale which owns the
closest indices. In your case, since all of the indices you're adding are
above the bounding box, they're ending up being owned by the last locale.
This is illustrated in slide 72 of the following presentation better than
I can say in words here:
https://chapel-lang.org/presentations/ChapelForNWCPP2018-presented.pdf
I am not necessarily looking for the most efficient solution at this
point but rather for something simple using the basic language
mechanisms in Chapel.
Hope this description helps! Let us know if it raises additional
questions.
-Brad
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Chapel-users mailing list
Chapel-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/chapel-users