Thanks for your response, George. Just confirming that this should be okay to use iteratively is a huge help.
After further investigation, this only seems to occur on my test workstation with the following configuration...

  Open MPI repo revision: v4.0.2
  Open MPI release date: Oct 07, 2019
  Open RTE: 4.0.2
  Configured architecture: x86_64-apple-darwin19.2.0

  g++ --version
  Apple clang version 11.0.3 (clang-1103.0.32.29)
  Target: x86_64-apple-darwin19.2.0
  Thread model: posix

I am not currently able to duplicate the errors on an actual Linux cluster with Open MPI 4.0.2, so this is probably insignificant for most production use. But in case you are interested, from what I can tell the following code should reproduce the error for Open MPI/Clang...

#include <mpi.h>
#include <stdio.h>

int main(int argc, const char *argv[]) {

    MPI_Init(NULL, NULL);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    for (int run = 1; run <= 30; run++) {

        MPI_Comm topology;

        /* ring-style edges: each rank names its neighbor on either side */
        const int send[1]    = { world_rank == world_size - 1 ? 0 : world_rank + 1 };
        const int receive[1] = { world_rank > 0 ? world_rank - 1 : world_size - 1 };
        const int degrees[1] = { 1 };
        const int weights[1] = { 1 };

        printf("rank %d send -> %d\r\n", world_rank, send[0]);
        printf("rank %d receive -> %d\r\n", world_rank, receive[0]);

        MPI_Comm oldcomm = MPI_COMM_WORLD;

        /* reorder = 1; the new communicator is not freed inside the loop */
        MPI_Dist_graph_create(oldcomm, 1, send, degrees, receive, weights,
                              MPI_INFO_NULL, 1, &topology);
    }

    MPI_Finalize();
    return 0;
}

Thanks,

-Bradley


On Apr 6, 2020, at 10:36 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

Bradley,

Because you call them through a blocking MPI function, the operation is completed by the time you return from the MPI call. So, short story, you should be safe calling the dist_graph_create in a loop.

The segfault indicates a memory issue with some of the internals of the treematch. Do you have an example that reproduces this issue so that I can take a look and fix it?

Thanks,
George.


On Mon, Apr 6, 2020 at 11:31 AM Bradley Morgan via devel <devel@lists.open-mpi.org> wrote:

Hello OMPI Developers and Community,

I am interested in investigating dynamic runtime optimization of MPI topologies using an evolutionary approach.

My initial testing is resulting in segfaults/sigabrts when I attempt to iteratively create a new communicator with reordering enabled, e.g...

[88881] Signal: Segmentation fault: 11 (11)
[88881] Signal code: Address not mapped (1)
[88881] Failing at address: 0x0
[88881] [ 0] 0  libsystem_platform.dylib  0x00007fff69dff42d _sigtramp + 29
[88881] [ 1] 0  mpi_island_model_ea       0x0000000100000032 mpi_island_model_ea + 50
[88881] [ 2] 0  mca_topo_treematch.so     0x0000000105ddcbf9 free_list_child + 41
[88881] [ 3] 0  mca_topo_treematch.so     0x0000000105ddcbf9 free_list_child + 41
[88881] [ 4] 0  mca_topo_treematch.so     0x0000000105ddcd1f tm_free_tree + 47
[88881] [ 5] 0  mca_topo_treematch.so     0x0000000105dd6967 mca_topo_treematch_dist_graph_create + 9479
[88881] [ 6] 0  libmpi.40.dylib           0x00000001001992e0 MPI_Dist_graph_create + 640
[88881] [ 7] 0  mpi_island_model_ea       0x00000001000050c7 main + 1831

I see in some documentation that MPI_Dist_graph_create is not interrupt safe, which I interpret to mean it is not really designed for iterative use without some sort of safeguard to keep it from overlapping.

I guess my question is: are the topology mapping functions really meant to be called in iteration, or are they meant for single use?

If you guys think this is something that might be possible, do you have any suggestions for calling the topology mapping iteratively, or any hints, docs, etc. on what else might be going wrong here?

Thanks,

Bradley
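P.S. For completeness, below is a sketch of a variant of the reproducer loop that frees each communicator with MPI_Comm_free before the next iteration, so only one topology communicator is alive at a time. I have not verified whether freeing changes the treematch behavior on my workstation, so please treat it as an untested variation rather than a workaround.

#include <mpi.h>

int main(int argc, char *argv[]) {

    MPI_Init(&argc, &argv);

    int world_size, world_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    for (int run = 1; run <= 30; run++) {

        /* same ring-style edge description as in the reproducer above */
        const int send[1]    = { world_rank == world_size - 1 ? 0 : world_rank + 1 };
        const int receive[1] = { world_rank > 0 ? world_rank - 1 : world_size - 1 };
        const int degrees[1] = { 1 };
        const int weights[1] = { 1 };

        MPI_Comm topology;
        MPI_Dist_graph_create(MPI_COMM_WORLD, 1, send, degrees, receive, weights,
                              MPI_INFO_NULL, 1 /* reorder */, &topology);

        /* untested variation: release the reordered communicator before
           creating the next one */
        MPI_Comm_free(&topology);
    }

    MPI_Finalize();
    return 0;
}

Built and run with the usual wrappers (e.g. mpicc and mpirun -np 4); nothing unusual about the launch is assumed.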