Dear All,

My next piece of feedback concerns a problem in MPI_Gather.

MPI_Gather may fail with a truncation error when both of the
following conditions hold:
1: ompi_coll_tuned_gather_intra_linear_sync is called
(the message size is over 6000 bytes).

2: One of the send and receive datatypes is a derived datatype and
the other is a predefined datatype.

The attached C file triggers the truncation and produces the following output.

Output:
*** An error occurred in MPI_Gather
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

In this C program, the variable first_segment_count in
ompi_coll_tuned_gather_intra_linear_sync differs between the root and
the non-root processes, which causes the message to be truncated.
first_segment_size is not divisible by the size of the derived datatype,
but it is divisible by the size of the predefined datatype.
We have not solved this problem yet.
As a workaround, we do not select linear_sync in coll_tuned_decision_fixed.c.
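
To make the mismatch concrete, here is a minimal sketch of the arithmetic.
It is a simplified model, not the Open MPI source: the function names
model_first_segment_count and model_first_segment_bytes and the splitting
rule are assumptions for illustration only. Each side derives the element
count of its first segment from its own datatype size and count.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, assumed model of how a segmented gather might choose the
 * element count of its first segment: split only when the segment holds
 * at least one whole element and is smaller than the full message.
 * This is an illustration, not the actual Open MPI code. */
size_t model_first_segment_count(size_t segment_bytes, size_t type_size,
                                 size_t count)
{
    assert(type_size > 0);
    if (segment_bytes >= type_size && segment_bytes < type_size * count) {
        return segment_bytes / type_size;   /* whole elements that fit */
    }
    return count;                           /* no split: whole message */
}

/* Bytes covered by the first segment, as seen from one side. */
size_t model_first_segment_bytes(size_t segment_bytes, size_t type_size,
                                 size_t count)
{
    return model_first_segment_count(segment_bytes, type_size, count)
           * type_size;
}
```

Take the values from the attached program (n = 751, 4-byte int), so a
non-root process sends 1502 MPI_INT elements and the root receives one
6008-byte derived element. With an assumed first segment size of 1024
bytes, the model gives 256 elements (1024 bytes) on the non-root side and
1 element (6008 bytes) on the root side. The two sides therefore disagree
about where the first segment ends.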

Best Regards,

Yuki MATSUMOTO
MPI development team,
Fujitsu

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "mpi.h"


int main (int argc, char **argv)
{
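        int sbuf[30000];
        int rbuf[30000];
        int myproc, nprocs;
        MPI_Datatype itype;     /* derived receive type, built below */

        int n = 751;
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myproc);

        if ( 0 == myproc)
        {
                /* 2 * 751 * sizeof(int) = 6008 bytes with a 4-byte int */
                printf ("msg size:%zu\n",2*n*sizeof(int));
        }

        /* Two blocks of n ints whose starts are 2*n ints apart. */
        MPI_Type_vector(2,n,2*n, MPI_INT,&itype);
        MPI_Type_commit(&itype);

        /* Initialize all 2*n ints that are sent. */
        memset((void *)sbuf, myproc+1 , sizeof(int)*2*n);

        /* Send side: 2*n predefined MPI_INT. Receive side: 1 derived itype. */
        MPI_Gather(sbuf, 2*n, MPI_INT, rbuf,1,itype, 0, MPI_COMM_WORLD);

        MPI_Type_free(&itype);
        MPI_Finalize();
        return 0;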
}
