Btw, the proposed validator was incorrect: the first printf reads i before it is initialized; instead of

    printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, 0, recvbuf[i], size);

it
should be

    printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, 0, recvbuf[0], size);

(A corrected sketch of the full test appears after the quoted thread below.)

George.

On Apr 21, 2014, at 19:32 , George Bosilca <bosi...@icl.utk.edu> wrote:

> r31473 should fix this issue.
>
> George.
>
> On Apr 21, 2014, at 10:05 , Lisandro Dalcin <dalc...@gmail.com> wrote:
>
>> I'm not sure this is actually a bug, but the difference may surprise
>> users. It seems that the implementation of
>> MPI_Ireduce_scatter(MPI_IN_PLACE,...) (ab?)uses the recvbuf to compute
>> the intermediate reduction, while MPI_Reduce_scatter(MPI_IN_PLACE,...)
>> does not.
>>
>> Look at the following code (set up to be run in up to 16 processes).
>> While MPI_Reduce_scatter() does not change the second and following
>> elements of recvbuf, the nonblocking variant does modify the second and
>> following entries in some ranks.
>>
>> [dalcinl@kw2060 openmpi]$ cat ireduce_scatter.c
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <mpi.h>
>> int main(int argc, char *argv[])
>> {
>>   int i, size, rank;
>>   int recvbuf[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
>>   int rcounts[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
>>   MPI_Init(&argc, &argv);
>>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>   if (size > 16) MPI_Abort(MPI_COMM_WORLD, 1);
>> #ifndef NBCOLL
>> #define NBCOLL 1
>> #endif
>> #if NBCOLL
>>   {
>>     MPI_Request request;
>>     MPI_Ireduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
>>                         MPI_SUM, MPI_COMM_WORLD, &request);
>>     MPI_Wait(&request, MPI_STATUS_IGNORE);
>>   }
>> #else
>>   MPI_Reduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
>>                      MPI_SUM, MPI_COMM_WORLD);
>> #endif
>>   printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, 0, recvbuf[i], size);
>>   for (i=1; i<size; i++) {
>>     printf("[%d] rbuf[%d]=%2d expected:%2d\n", rank, i, recvbuf[i], 1);
>>   }
>>   MPI_Finalize();
>>   return 0;
>> }
>>
>> [dalcinl@kw2060 openmpi]$ mpicc -DNBCOLL=0 ireduce_scatter.c
>> [dalcinl@kw2060 openmpi]$ mpiexec -n 3 ./a.out | sort
>> [0] rbuf[0]= 3 expected: 3
>> [0] rbuf[1]= 1 expected: 1
>> [0] rbuf[2]= 1 expected: 1
>> [1] rbuf[0]= 3 expected: 3
>> [1] rbuf[1]= 1 expected: 1
>> [1] rbuf[2]= 1 expected: 1
>> [2] rbuf[0]= 3 expected: 3
>> [2] rbuf[1]= 1 expected: 1
>> [2] rbuf[2]= 1 expected: 1
>>
>> [dalcinl@kw2060 openmpi]$ mpicc -DNBCOLL=1 ireduce_scatter.c
>> [dalcinl@kw2060 openmpi]$ mpiexec -n 3 ./a.out | sort
>> [0] rbuf[0]= 3 expected: 3
>> [0] rbuf[1]= 2 expected: 1
>> [0] rbuf[2]= 2 expected: 1
>> [1] rbuf[0]= 3 expected: 3
>> [1] rbuf[1]= 1 expected: 1
>> [1] rbuf[2]= 1 expected: 1
>> [2] rbuf[0]= 3 expected: 3
>> [2] rbuf[1]= 1 expected: 1
>> [2] rbuf[2]= 1 expected: 1
>>
>> --
>> Lisandro Dalcin
>> ---------------
>> CIMEC (UNL/CONICET)
>> Predio CONICET-Santa Fe
>> Colectora RN 168 Km 472, Paraje El Pozo
>> 3000 Santa Fe, Argentina
>> Tel: +54-342-4511594 (ext 1016)
>> Tel/Fax: +54-342-4511169
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/04/14565.php
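For completeness, here is a minimal sketch of the quoted test with the corrected validator folded in. The only change of substance is the validation loop, which starts at element 0 and expects the communicator size there and 1 everywhere else; merging the two printf calls into one loop is my restructuring, not part of the original test, and the buffers and NBCOLL switch are kept as Lisandro posted them.

    /* ireduce_scatter.c -- same test as above, with the corrected validator. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int i, size, rank;
        int recvbuf[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
        int rcounts[] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (size > 16) MPI_Abort(MPI_COMM_WORLD, 1);

    #ifndef NBCOLL
    #define NBCOLL 1
    #endif
    #if NBCOLL
        {   /* nonblocking reduce-scatter in place, then wait */
            MPI_Request request;
            MPI_Ireduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
                                MPI_SUM, MPI_COMM_WORLD, &request);
            MPI_Wait(&request, MPI_STATUS_IGNORE);
        }
    #else
        /* blocking reduce-scatter in place */
        MPI_Reduce_scatter(MPI_IN_PLACE, recvbuf, rcounts, MPI_INT,
                           MPI_SUM, MPI_COMM_WORLD);
    #endif

        /* Element 0 holds this rank's reduced segment (a 1 summed from every
           rank, i.e. size); the remaining local entries should be untouched. */
        for (i = 0; i < size; i++) {
            int expected = (i == 0) ? size : 1;
            printf("[%d] rbuf[%d]=%2d expected:%2d\n",
                   rank, i, recvbuf[i], expected);
        }

        MPI_Finalize();
        return 0;
    }

Built and run with the same commands as in the quoted message (mpicc -DNBCOLL=0 or -DNBCOLL=1, then mpiexec -n 3 ./a.out | sort), both variants should report matching expected values once r31473 is in.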