I'm not sure this is actually a bug, but the difference may surprise
users. It seems that the implementation of
MPI_Ireduce_scatter(MPI_IN_PLACE,...) (ab?)uses the recvbuf to compute
the intermediate reduction, while MPI_Reduce_scatter(MPI_IN_PLACE,...)
does not.
Look at the following code (set up to run on up to 16 processes).
While MPI_Reduce_scatter() does not change the second and following
elements of recvbuf, the nonblocking variant does modify those
entries on some ranks.
[dalcinl@kw2060 openmpi]$ cat ireduce_scatter.c
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
  int i,size,rank;
  /* every rank contributes 1 in each slot and receives one element */
  int recvbuf[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
  int rcounts[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  /* the buffers above only hold 16 entries */
  if (size > 16) MPI_Abort(MPI_COMM_WORLD,1);
#ifndef NBCOLL
#define NBCOLL 1
#endif
#if NBCOLL
  {
    MPI_Request request;
    MPI_Ireduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
                        MPI_SUM, MPI_COMM_WORLD, &request);
    MPI_Wait(&request,MPI_STATUS_IGNORE);
  }
#else
  MPI_Reduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
                     MPI_SUM, MPI_COMM_WORLD);
#endif
  /* entry 0 holds the reduced result; index it explicitly, i is not set yet */
  printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, 0, recvbuf[0], size);
  /* the remaining entries are expected to keep their initial value */
  for (i=1; i<size; i++) {
    printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, i, recvbuf[i], 1);
  }
  MPI_Finalize();
  return 0;
}
[dalcinl@kw2060 openmpi]$ mpicc -DNBCOLL=0 ireduce_scatter.c
[dalcinl@kw2060 openmpi]$ mpiexec -n 3 ./a.out | sort
[0] rbuf[0]= 3 expected: 3
[0] rbuf[1]= 1 expected: 1
[0] rbuf[2]= 1 expected: 1
[1] rbuf[0]= 3 expected: 3
[1] rbuf[1]= 1 expected: 1
[1] rbuf[2]= 1 expected: 1
[2] rbuf[0]= 3 expected: 3
[2] rbuf[1]= 1 expected: 1
[2] rbuf[2]= 1 expected: 1
[dalcinl@kw2060 openmpi]$ mpicc -DNBCOLL=1 ireduce_scatter.c
[dalcinl@kw2060 openmpi]$ mpiexec -n 3 ./a.out | sort
[0] rbuf[0]= 3 expected: 3
[0] rbuf[1]= 2 expected: 1
[0] rbuf[2]= 2 expected: 1
[1] rbuf[0]= 3 expected: 3
[1] rbuf[1]= 1 expected: 1
[1] rbuf[2]= 1 expected: 1
[2] rbuf[0]= 3 expected: 3
[2] rbuf[1]= 1 expected: 1
[2] rbuf[2]= 1 expected: 1
--
Lisandro Dalcin
---------------
CIMEC (UNL/CONICET)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1016)
Tel/Fax: +54-342-4511169