George,

IMHO, you are right!

Attached is a new version of Ghislain's program that uses
MPI_Dist_graph_neighbors_count and MPI_Dist_graph_neighbors,
as you suggested.

It produces correct results.

/* note that in this case, realDestinations is similar to targets,
so I might have left some silent bugs in the program */

Bottom line: though the Open MPI implementation of MPI_Dist_graph_create is not
deterministic (in the order in which it stores the neighbors), it is compliant
with the MPI standard.
/* not to mention this is not the right place to argue what the standard
could or should have been ... */
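
For reference, the query pattern used in the attached program is roughly the
following (commGraph is the communicator returned by MPI_Dist_graph_create,
and the variable names are simplified):

    int indegree, outdegree, weighted;
    MPI_Dist_graph_neighbors_count(commGraph, &indegree, &outdegree, &weighted);
    std::vector<int> sources(indegree), sourceWeights(indegree);
    std::vector<int> destinations(outdegree), destWeights(outdegree);
    MPI_Dist_graph_neighbors(commGraph, indegree, &sources[0], &sourceWeights[0],
                             outdegree, &destinations[0], &destWeights[0]);
    /* sources[] and destinations[] now hold the neighbour order the library
       actually uses, which is the order MPI_Neighbor_alltoallw expects */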

Cheers,

Gilles


On 2014/11/24 12:47, George Bosilca wrote:
> I would argue this is a typical user-level bug.
>
> The major difference between dist_create and dist_create_adjacent is that
> in the latter each process provides its neighbors in an expected order
> (one that matches the info provided to the MPI_Neighbor_alltoallw call).
> When the topology is created with dist_create, every process will end up
> having the correct partial topology, but in an order that doesn't match
> what the user expected (not in the rank order of the neighbors). However,
> I can't find anything in the standard that would require the MPI library
> to sort the neighbors. I would assume it is the user's responsibility to
> make sure that they are using the topology in the right order, where the
> right order is what the communicator really contains and not what the
> user expects based on prior knowledge.
>
>   George.
>
>
> On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:
>
>>  Ghislain,
>>
>> I can confirm there is a bug in mca_topo_base_dist_graph_distribute.
>>
>> FYI a proof of concept is available at
>> https://github.com/open-mpi/ompi/pull/283
>> and I recommend you use MPI_Dist_graph_create_adjacent if this meets your
>> needs.
>>
>> As a side note, the right way to set the info argument is:
>>         MPI_Info info = MPI_INFO_NULL;
>>
>> /* MPICH is more picky and crashes with info = NULL */
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On 2014/11/21 18:21, Ghislain Viguier wrote:
>>
>> Hi Gilles and Howard,
>>
>> The use of MPI_Dist_graph_create_adjacent solves the issue :)
>>
>> Thanks for your help!
>>
>> Best regards,
>> Ghislain
>>
>> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>
>>    Hi Ghislain,
>>
>> that sounds like a bug in MPI_Dist_graph_create :-(
>>
>> you can use MPI_Dist_graph_create_adjacent instead:
>>
>> MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0], &weights[0],
>>                                degrees, &targets[0], &weights[0], info,
>>                                rankReordering, &commGraph);
>>
>> it does not crash and, as far as I understand, it produces correct results.
>>
>> According to the MPI standard (Example 7.3), that should do the same
>> thing; that's why I think there is a bug in MPI_Dist_graph_create.
>>
>> Cheers,
>>
>> Gilles
>>
>>
>>
>>
>> On 2014/11/21 2:21, Howard Pritchard wrote:
>>
>> Hi Ghislain,
>>
>> I tried to run your test with mvapich 1.9 and get a "message truncated"
>> failure at three ranks.
>>
>> Howard
>>
>>
>> 2014-11-20 8:51 GMT-07:00 Ghislain Viguier <ghislain.vigu...@gmail.com>:
>>
>>
>>  Dear support,
>>
>> I'm encountering an issue with the MPI_Neighbor_alltoallw routine of
>> Open MPI 1.8.3.
>> I have enclosed a test case along with information about my workstation.
>>
>> In this test, I define a weighted topology for 5 processes, where the
>> weights represent the number of buffers to send/receive:
>>     rank
>>       0 : | x |
>>       1 : | 2 | x |
>>       2 : | 1 | 1 | x |
>>       3 : | 3 | 2 | 3 | x |
>>       4 : | 5 | 2 | 2 | 2 | x |
>>
>> In this topology, rank 1 will send/receive:
>>    2 buffers to/from rank 0,
>>    1 buffer to/from rank 2,
>>    2 buffers to/from rank 3,
>>    2 buffers to/from rank 4.
>>
>> The send buffers are defined with MPI_Type_create_hindexed_block. This
>> allows the same buffer to be used for several communications without
>> duplicating it (read only).
>> Here, rank 1 will have 2 send buffers (the max of 2, 1, 2, 2).
>> The receive buffer is a contiguous buffer defined with MPI_Type_contiguous.
>> Here, the receive buffer of rank 1 is of size 7 (2+1+2+2).
>>
>> This test case is successful for 2 or 3 processes. For 4 processes, the test
>> fails roughly once for every 3 successes. For 5 processes, the test fails every time.
>>
>> The error code is: *** MPI_ERR_IN_STATUS: error code in status
>>
>> I don't understand what I am doing wrong.
>>
>> Could you please have a look at it?
>>
>> Thank you very much.
>>
>> Best regards,
>> Ghislain Viguier
>>
>> --
>> Ghislain Viguier
>> Tél. 06 31 95 03 17
>>

//============================================================================
// Name        : 027_MPI_Neighbor_alltoallw_synthetic.cpp
// Author      : 
// Version     :
// Copyright   : Your copyright notice
// Description : MPI_Neighbor_alltoallw test on a weighted distributed graph topology
//============================================================================

#include <mpi.h>
#include <stddef.h>
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <vector>

using namespace std;

/* Given the neighbour ranks as returned by MPI_Dist_graph_neighbors (in an
 * order chosen by the library) and the rank of the calling process, replace
 * ranks[] by the inverse permutation: on output, ranks[j] is the position,
 * in the library-returned array, of the j-th neighbour in increasing rank
 * order (the calling rank itself is excluded from the numbering). */
static void reorder (int * ranks, int degrees, int rank) {
	vector<int> tmp(degrees);
	for (int i = 0 ; i < degrees ; i++) {
		assert(ranks[i] != rank);
		/* compress the rank space by skipping the calling rank */
		if (ranks[i] > rank) {
			ranks[i]--;
		}
		tmp[i] = ranks[i];
	}
	/* invert the permutation: ranks[tmp[i]] = i */
	for (int i = 0 ; i < degrees ; i++) {
		for (int j = 0 ; j < degrees ; j++) {
			if (tmp[i] == j) {
				ranks[j] = i;
			}
		}
	}
}

int main(int argc, char *argv[]) {

	const int sendBufferSize = 1;

	///////////////   MPI initialization   ///////////////

	int ierr;
	int nbProc;
	int rank;
	int indegrees, outdegrees, weighted;
	ierr = MPI_Init(&argc, &argv);
	assert(!ierr);
	ierr = MPI_Comm_size(MPI_COMM_WORLD, &nbProc);
	assert(!ierr);
	ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	assert(!ierr);

	assert(nbProc <= 5);

	///////////////   weighted topology   ///////////////
	//   0  | x |
	//   1  | 2 | x |
	//   2  | 1 | 1 | x |
	//   3  | 3 | 2 | 3 | x |
	//   4  | 5 | 2 | 2 | 2 | x |
	// rank   0   1   2   3   4

	int degrees = nbProc - 1;
	vector<int> targets(4);
	vector<int> weights(4);
	switch (rank) {

		case 0:
		targets[0] = 1; targets[1] = 2; targets[2] = 3; targets[3] = 4;
		weights[0] = 2; weights[1] = 1; weights[2] = 3; weights[3] = 5;
		break;

		case 1:
		targets[0] = 0; targets[1] = 2; targets[2] = 3; targets[3] = 4;
		weights[0] = 2; weights[1] = 1; weights[2] = 2; weights[3] = 2;
		break;

		case 2:
		targets[0] = 0; targets[1] = 1; targets[2] = 3; targets[3] = 4;
		weights[0] = 1; weights[1] = 1; weights[2] = 3; weights[3] = 2;
		break;

		case 3:
		targets[0] = 0; targets[1] = 1; targets[2] = 2; targets[3] = 4;
		weights[0] = 3; weights[1] = 2; weights[2] = 3; weights[3] = 2;
		break;

		case 4:
		targets[0] = 0; targets[1] = 1; targets[2] = 2; targets[3] = 3;
		weights[0] = 5; weights[1] = 2; weights[2] = 2; weights[3] = 2;
		break;

	default:
		break;
	}

	int nbSendBuffers = *max_element(weights.begin(),
			weights.begin() + degrees);

	MPI_Info info = MPI_INFO_NULL;
	int rankReordering = 0;
	MPI_Comm commGraph;
	int n = 1;
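	/* each process contributes only its own edges (n = 1 source, this rank);
	 * the order in which the neighbours are stored in commGraph is not
	 * specified by the standard, so it is queried afterwards rather than assumed */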
	ierr = MPI_Dist_graph_create(MPI_COMM_WORLD, n, &rank, &degrees,
			&targets[0], &weights[0], info, rankReordering, &commGraph);
	assert(!ierr);
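	/* retrieve the actual in/out degrees and whether the edges carry weights */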
	ierr = MPI_Dist_graph_neighbors_count(commGraph, &indegrees, &outdegrees, &weighted);
	assert(!ierr);
	assert(weighted);
	/* the test graph is complete, so both degrees must match the local degree */
	assert(indegrees == degrees && outdegrees == degrees);
	vector<int> sources(indegrees);
	vector<int> sourceWeights(indegrees);
	vector<int> destinations(outdegrees);
	vector<int> destinationWeights(outdegrees);
	ierr = MPI_Dist_graph_neighbors(commGraph, indegrees, &sources[0], &sourceWeights[0],
			outdegrees, &destinations[0], &destinationWeights[0]);
	assert(!ierr);
	/* build, for sources and destinations, the permutation from rank order to
	 * the order in which the library returned the neighbours */
	reorder(&sources[0], degrees, rank);
	reorder(&destinations[0], degrees, rank);

	int graphRank;
	ierr = MPI_Comm_rank(commGraph, &graphRank);
	assert(!ierr);

	///////////////   send rcv buffers   ///////////////

	vector<int *> sendBuffers(nbSendBuffers);
	cout << "rank " << rank << ", send buffers : ";
	for (int i = 0; i < nbSendBuffers; ++i) {
		sendBuffers[i] = new int[sendBufferSize];
		for (int j = 0; j < sendBufferSize; ++j) {
			sendBuffers[i][j] = rank;
			cout << sendBuffers[i][j] << " ";
		}
	}
	cout << endl << flush;

	MPI_Datatype mpiSendBufferType;
	ierr = MPI_Type_contiguous(sendBufferSize, MPI_INT, &mpiSendBufferType);
	assert(!ierr);
	ierr = MPI_Type_commit(&mpiSendBufferType);
	assert(!ierr);

	vector<int> mpiSendCounts(degrees, 1);
	vector<MPI_Aint> mpiSendDisplacements(degrees, 0);
	vector<MPI_Datatype> mpiSendBufferTypes(degrees);
	vector<MPI_Aint> mpiRcvBufferDispl(degrees);
	vector<MPI_Datatype> mpiRcvBufferTypes(degrees, mpiSendBufferType);

	MPI_Aint mpiAdressBase;
	ierr = MPI_Get_address(sendBuffers[0], &mpiAdressBase);
	assert(!ierr);

	int nbRcvBuffers = 0;
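	/* for each neighbour, build an hindexed_block send datatype pointing at the
	 * first 'weight' send buffers (the buffers are shared between neighbours),
	 * and record where that neighbour's data lands in the receive buffer;
	 * nbRcvBuffers accumulates the total number of buffers to receive */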
	for (int i = 0; i < degrees; ++i) {
		const int weight = destinationWeights[i];
		mpiRcvBufferDispl[sources[i]] = nbRcvBuffers * sendBufferSize * sizeof(int);
		vector<MPI_Aint> arrayOfDisplacements(weight);
		for (int j = 0; j < weight; ++j) {
			MPI_Aint mpiAdress;
			ierr = MPI_Get_address(sendBuffers[j], &mpiAdress);
			assert(!ierr);
			arrayOfDisplacements[j] = mpiAdress - mpiAdressBase;
		}

		ierr = MPI_Type_create_hindexed_block(weight, 1,
				&arrayOfDisplacements[0], mpiSendBufferType,
				&mpiSendBufferTypes[i]);
		assert(!ierr);
		ierr = MPI_Type_commit(&mpiSendBufferTypes[i]);
		assert(!ierr);

		nbRcvBuffers += weight;
	}

	vector<int> rcvBuffer(nbRcvBuffers * sendBufferSize, -999);

	///////////////   send rcv buffers   ///////////////
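	// each send count is 1 because the per-neighbour hindexed_block datatype
	// already packs 'weight' buffers; each receive count is the corresponding
	// source weight, i.e. the number of sendBufferSize-int blocks expected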

	ierr = MPI_Neighbor_alltoallw(sendBuffers[0], &mpiSendCounts[0],
			&mpiSendDisplacements[0], &mpiSendBufferTypes[0], &rcvBuffer[0],
			&sourceWeights[0], &mpiRcvBufferDispl[0], &mpiRcvBufferTypes[0],
			commGraph);
	assert(!ierr);

	///////////////   printing rcv buffer   ///////////////

	cout << "rank " << rank << ", rcv buffers : ";
	for (size_t i = 0; i < rcvBuffer.size(); ++i)
		cout << rcvBuffer[i] << " ";
	cout << endl;

	///////////////   freeing   ///////////////

	for (int i = 0; i < nbSendBuffers; ++i)
		delete[] sendBuffers[i];

	for (int i = 0; i < degrees; ++i) {
		ierr = MPI_Type_free(&mpiSendBufferTypes[i]);
		assert(!ierr);
	}

	ierr = MPI_Type_free(&mpiSendBufferType);
	assert(!ierr);

	ierr = MPI_Barrier(MPI_COMM_WORLD);
	assert(!ierr);
	ierr = MPI_Finalize();
	assert(!ierr);

}
