Hi Marcin --

I am wondering if there is a way to change a mapping of a distributed array.

Good question. Feel free to direct questions like this that might be of general interest to Chapel programmers to Stack Overflow if you use it (tends to be a more searchable, permanent place to look for answers to such questions).

High-level answer: This isn't quite what you were asking, but there isn't a way to change the domain of an array nor the domain map of a domain (currently at least, nor a plan to support it in the future). These decisions are effectively part of the static type of the variable and so would be challenging to change. The rationale for this decision is that they have a big impact on the code generated for an array.

Getting closer to what you're asking: a given domain map can potentially be modified to affect the implementation of its doamins and their arrays, including their distributions -- if the domain map author has provided a means to do so. Most of the rest of this mail will describe what is possible within our current domain maps, but note that one could always modify the domain maps further, or write a new domain map, to support things that our domain maps don't happen to support currently.

Diving in a bit further, in the example you provide, the issue is that the Block distribution doesn't (currently) have a way of redefining its bounding box / distribution as you'd ideally like to:

var domC = {0..0} dmapped Block({0..0});

...

 domC = {0..i}; // dmapped Block({0..i}); // needed, or halt on assignment

So once the bounding box defining its distribution is set to {0..0}, there's no way to update it. The reason for this is partially due to laziness on the part of the domain map author (in this case, probably me), and partially fear of leading users to do something expensive without being aware of it. Specifically, if you think about the communication required to naively change the bounding box from 0..0 to 0..1 to 0..2 to 0..n one element at a time across a large number of locales, there'd be lots of little ripples of data across the locale set with each change.

(That said, the Block distribution probably really _ought_ to define a method that redefines the bounding box, regardless of the cost, where the method's documentation could warn against using it too frequently for performance reasons).


I understand why is this, but I am looking for a way to add values to a distributed array and then “rebalance” the distribution at some point.
So there are 2 questions I have:


1. Is there a way to do this with arrays at all without somehow creating a whole new array?

 2.  Is there any better approach for this?

Answering both questions at once, my mind goes to using a different distribution that's more amenable to load balancing as new indices are added to it. For example, rewriting your example to use the Cyclic distribution:

  use CyclicDist;

  var domC = {0..0} dmapped Cyclic(startIdx=0);

the output shows up like this:

  Initially, C is: 0
  C(0) assigned, C is: 0. domC is {0..0}
  C(1) assigned, C is: 0 1. domC is {0..1}
  C(2) assigned, C is: 0 1 0. domC is {0..2}
  C(3) assigned, C is: 0 1 0 1. domC is {0..3}
  C(4) assigned, C is: 0 1 0 1 0. domC is {0..4}
  C(5) assigned, C is: 0 1 0 1 0 1. domC is {0..5}

So, by nature, Cyclic is better suited to distributing new indices across the locales without any need to change the definition or type of the domain map.

If your data structure / algorithm is one that benefits from having consecutive blocks of _logical_ indices mapped to the same locale, then the BlockCyclic distribution might be a nice choice in that it would combine some of the index locality of the Block distribution with the ability to grow arbitrarily without redefining the domain map, as in the Cyclic distribution. But trying it within your program, I'm finding that Block-Cylic isn't mature enough to support your program yet, for lame reasons (of these three domain maps, it's the one that's seen the least amount of development + optimization effort and apparently re-assigning a domain dmapped with BlockCyclic() has never been implemented). Feel free to file a feature request against this if it it's of interest to you. But let me also emphasize that _index locality_, not spatial locality, is the main difference between Cyclic and Block-Cyclic. Both will store the local elements of a 1D array in consecutive memory locations, so they should both have similar spatial locality if all you were doing was iterating over all of the elements of an array (say).

If index locality doesn't matter to you at all, another option to explore would be to use a distributed associative domain. This results in even less spatial locality, and the load balancing would only be as good as its hash function. Distributed associative domains are a feature that have never made it onto master, but have been "almost done" forever, so this would be another place that'd be fair to file a feature request if it'd be useful/preferable to you than the 1D rectangular domains/arrays you're currently using.

Circling back around: even if Block() were to support a way to dynamically change its bounding box, I'm not convinced that it's what you'd want to use for your case given that you're potentially adding vertices incrementally, which isn't a cheap operation for a Block distribution. Note that even if we implemented a way to change the bounding box for Block(), we'd likely create a whole new array under the covers, assign between the arrays and throw the original away -- so you'd get the convenience of being able to change the bounding box, but wouldn't get anything like in-place reallocation and minimal shifting of data between locales without a lot of work (i.e., one _could_ implement this with effort, I'm just not convinced the benefit would be worth the effort, so wouldn't sign up for it myself).


With arrays with Block distribution, everything outside of the bounding box seems to go to the last locale.

This isn't quite right. It's actually that the n-dimensional bounding box specifies a partitioning on the n-dimensional space and that any indices that fall outside the bounding box are mapped to the locale which owns the closest indices. In your case, since all of the indices you're adding are above the bounding box, they're ending up being owned by the last locale.

This is illustrated in slide 72 of the following presentation better than I can say in words here:

https://chapel-lang.org/presentations/ChapelForNWCPP2018-presented.pdf


I am not necessarily looking for the most efficient solution at this point but rather for something simple using the basic language mechanisms in Chapel.

Hope this description helps! Let us know if it raises additional questions.

-Brad
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Chapel-users mailing list
Chapel-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/chapel-users

Reply via email to