Re: Changing distribution of an existing array

Brad Chamberlain Mon, 02 Apr 2018 16:18:39 -0700


Hi Marcin --

> I am wondering if there is a way to change a mapping of a distributed array.

Good question. Feel free to direct questions like this that might be of general interest to Chapel programmers to Stack Overflow if you use it (it tends to be a more searchable, permanent place to look for answers to such questions).

High-level answer: This isn't quite what you were asking, but there isn't a way to change the domain of an array or the domain map of a domain (at least currently, and there's no plan to support it in the future). These decisions are effectively part of the static type of the variable and so would be challenging to change. The rationale for this decision is that they have a big impact on the code generated for an array.

Getting closer to what you're asking: a given domain map can potentially be modified to affect the implementation of its domains and their arrays, including their distributions -- if the domain map author has provided a means to do so. Most of the rest of this mail will describe what is possible within our current domain maps, but note that one could always modify the domain maps further, or write a new domain map, to support things that our domain maps don't happen to support currently.

Diving in a bit further, in the example you provide, the issue is that the Block distribution doesn't (currently) have a way of redefining its bounding box / distribution as you'd ideally like to:

  use BlockDist;

  var domC = {0..0} dmapped Block({0..0});

  ...

  domC = {0..i}; // dmapped Block({0..i}); // needed, or halt on assignment

So once the bounding box defining its distribution is set to {0..0}, there's no way to update it. The reason for this is partially due to laziness on the part of the domain map author (in this case, probably me), and partially fear of leading users to do something expensive without being aware of it. Specifically, if you think about the communication required to naively change the bounding box from 0..0 to 0..1 to 0..2 to 0..n one element at a time across a large number of locales, there'd be lots of little ripples of data across the locale set with each change.

(That said, the Block distribution probably really _ought_ to define a method that redefines the bounding box, regardless of the cost, where the method's documentation could warn against using it too frequently for performance reasons.)

> I understand why is this, but I am looking for a way to add values to a distributed array and then “rebalance” the distribution at some point.
> So there are 2 questions I have:
> 1. Is there a way to do this with arrays at all without somehow creating a whole new array?
> 2. Is there any better approach for this?

Answering both questions at once, my mind goes to using a different distribution that's more amenable to load balancing as new indices are added to it. For example, rewriting your example to use the Cyclic distribution:


  use CyclicDist;

  var domC = {0..0} dmapped Cyclic(startIdx=0);

the output shows up like this:

  Initially, C is: 0
  C(0) assigned, C is: 0. domC is {0..0}
  C(1) assigned, C is: 0 1. domC is {0..1}
  C(2) assigned, C is: 0 1 0. domC is {0..2}
  C(3) assigned, C is: 0 1 0 1. domC is {0..3}
  C(4) assigned, C is: 0 1 0 1 0. domC is {0..4}
  C(5) assigned, C is: 0 1 0 1 0 1. domC is {0..5}

So, by nature, Cyclic is better suited to distributing new indices across the locales without any need to change the definition or type of the domain map.
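
For reference, a complete program along these lines might look like the sketch below. Only the `use` and `domC` declarations come from the snippet above. The loop, the messages, and the choice to store each element's owning locale ID are a reconstruction guessed from the output, not a copy of the original program, and `n` is a hypothetical config constant.

```chapel
use CyclicDist;

// Hypothetical: number of indices to append after index 0.
config const n = 5;

// Declarations taken from the snippet above.
var domC = {0..0} dmapped Cyclic(startIdx=0);
var C: [domC] int;

// Assumption: each element records the ID of the locale that owns it.
C[0] = C[0].locale.id;
writeln("Initially, C is: ", C);

for i in 1..n {
  // Growing the domain resizes C; existing elements keep their values.
  domC = {0..i};
  C[i] = C[i].locale.id;
  writeln("C(", i, ") assigned, C is: ", C, ". domC is ", domC);
}
```

Running it on two locales (for example with `-nl 2`) is what the alternating 0 / 1 pattern in the output above would correspond to.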

If your data structure / algorithm is one that benefits from having consecutive blocks of _logical_ indices mapped to the same locale, then the BlockCyclic distribution might be a nice choice in that it would combine some of the index locality of the Block distribution with the ability to grow arbitrarily without redefining the domain map, as in the Cyclic distribution. But trying it within your program, I'm finding that Block-Cyclic isn't mature enough to support your program yet, for lame reasons (of these three domain maps, it's the one that's seen the least amount of development + optimization effort, and apparently re-assigning a domain dmapped with BlockCyclic() has never been implemented). Feel free to file a feature request against this if it's of interest to you.

But let me also emphasize that _index locality_, not spatial locality, is the main difference between Cyclic and Block-Cyclic. Both will store the local elements of a 1D array in consecutive memory locations, so they should both have similar spatial locality if all you were doing was iterating over all of the elements of an array (say).

If index locality doesn't matter to you at all, another option to explore would be to use a distributed associative domain. This results in even less spatial locality, and the load balancing would only be as good as its hash function. Distributed associative domains are a feature that has never made it onto master, but has been "almost done" forever, so this would be another place where it'd be fair to file a feature request if it'd be more useful to you than the 1D rectangular domains/arrays you're currently using.

Circling back around: even if Block() were to support a way to dynamically change its bounding box, I'm not convinced that it's what you'd want to use for your case given that you're potentially adding vertices incrementally, which isn't a cheap operation for a Block distribution. Note that even if we implemented a way to change the bounding box for Block(), we'd likely create a whole new array under the covers, assign between the arrays and throw the original away -- so you'd get the convenience of being able to change the bounding box, but wouldn't get anything like in-place reallocation and minimal shifting of data between locales without a lot of work (i.e., one _could_ implement this with effort, I'm just not convinced the benefit would be worth the effort, so wouldn't sign up for it myself).
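
In the meantime, the "new array, assign, throw the original away" approach can be done by hand at whatever point you want to rebalance. The following is only a rough sketch of that idea, not an existing Block() feature: the names `oldDom`, `oldC`, `newDom`, `newC` and the config constant `n` are all hypothetical.

```chapel
use BlockDist;

// Hypothetical problem size.
config const n = 8;

// An array whose Block bounding box ({0..0}) no longer matches
// the index set it actually holds ({0..n-1}).
const oldDom = {0..n-1} dmapped Block(boundingBox={0..0});
var oldC: [oldDom] int;
forall i in oldDom do
  oldC[i] = i;

// "Rebalance" by hand: declare a new domain whose bounding box covers
// all of the indices, then copy the data with whole-array assignment.
const newDom = {0..n-1} dmapped Block(boundingBox={0..n-1});
var newC: [newDom] int;
newC = oldC;

// Print the locale that owns each element of the new array.
for i in newDom do
  write(newC[i].locale.id, " ");
writeln();
```

The copy moves every element that changes owner, so this is something to do occasionally (say, after a batch of insertions) rather than after each new index.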

> With arrays with Block distribution, everything outside of the bounding box seems to go to the last locale.

This isn't quite right. It's actually that the n-dimensional bounding box specifies a partitioning on the n-dimensional space and that any indices that fall outside the bounding box are mapped to the locale which owns the closest indices. In your case, since all of the indices you're adding are above the bounding box, they're ending up being owned by the last locale.

This is illustrated in slide 72 of the following presentation better than I can say in words here:


https://chapel-lang.org/presentations/ChapelForNWCPP2018-presented.pdf
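
A small experiment along the following lines should also show the bounding-box behavior directly. It is a sketch with made-up sizes: the bounding box `{0..7}` and the domain `{-4..11}` are arbitrary choices for illustration.

```chapel
use BlockDist;

// The bounding box covers only 0..7, but the domain also contains
// indices below it (-4..-1) and above it (8..11).
const D = {-4..11} dmapped Block(boundingBox={0..7});
var A: [D] int;

// Store in each element the ID of the locale that owns it.
forall a in A do
  a = a.locale.id;

writeln(A);
```

Run on several locales, indices -4..-1 should report the same locale as index 0, and indices 8..11 the same locale as index 7, since those are the closest indices inside the bounding box.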

> I am not necessarily looking for the most efficient solution at this point but rather for something simple using the basic language mechanisms in Chapel.

Hope this description helps! Let us know if it raises additional questions.


-Brad
