On further inspection, the code for MPI_Type_size in MPICH checks for MPI_DATATYPE_NULL. Is it possible you were using a configuration of MPICH that turned off the error checking?
Bill

On Sep 15, 2014, at 1:11 PM, William Gropp <[email protected]> wrote:

> Actually, MPICH is incorrect here. NULL objects are an error unless
> specifically permitted.
>
> Bill
>
> On Sep 15, 2014, at 1:08 PM, Barry Smith <[email protected]> wrote:
>
>>
>> Matt,
>>
>> I ran with OpenMPI and got exactly the error you'd expect and what they
>> reported, "An error occurred in MPI_Type_size". A simple use of the debugger
>> would reveal where it happened. I suspect that MPICH is more generous when
>> you call MPI_Type_size() with a null type; perhaps it just gives a size of
>> zero.
>>
>> I hunted around on the web and could not find a definitive statement of
>> what MPI_Type_size() should do when passed an argument of a null datatype.
>>
>> Barry
>>
>> On Sep 15, 2014, at 4:40 AM, Matthew Knepley <[email protected]> wrote:
>>
>>> On Sun, Sep 14, 2014 at 8:36 PM, Barry Smith <[email protected]> wrote:
>>>
>>> Pierre,
>>>
>>> Thanks for reporting this; it is indeed our bug. In petsclog.h we have
>>> macros for the various MPI calls in order to log their usage, for example,
>>>
>>> #define MPI_Scatter(sendbuf,sendcount,sendtype,recvbuf,recvcount,recvtype,root,comm) \
>>>   ((petsc_scatter_ct++,0) || PetscMPITypeSize(&petsc_recv_len,recvcount,recvtype) || MPI_Scatter(sendbuf,sendcount,sendtype,recvbuf,recvcount,recvtype,root,comm))
>>>
>>> but PetscMPITypeSize() simply called MPI_Type_size(), which generated an MPI
>>> error for MPI_DATATYPE_NULL:
>>>
>>> PETSC_STATIC_INLINE PetscErrorCode PetscMPITypeSize(PetscLogDouble *buff,PetscMPIInt count,MPI_Datatype type)
>>> {
>>>   PetscMPIInt mysize;
>>>   return (MPI_Type_size(type,&mysize) || ((*buff += (PetscLogDouble) (count*mysize)),0));
>>> }
>>>
>>> What error did you get? Why did I not get this error when I ran it? I ran
>>> with MPICH 3.0.4 since that was the one I had compiled for C++.
>>>
>>>   Matt
>>>
>>> In the branch barry/fix-usage-with-mpidatatypenull I have added a check for
>>> this special case to avoid the MPI_Type_size() call. I will put this branch
>>> into next, and if all tests pass it will be merged into maint and master and
>>> be in the next patch release.
>>>
>>> Thank you for reporting the problem.
>>>
>>> Barry
>>>
>>> Barry still thinks MPI 1.1 is the height of HPC computing :-(
>>>
>>> On Sep 14, 2014, at 4:16 PM, Pierre Jolivet <[email protected]> wrote:
>>>
>>>> Hello,
>>>> Could you please explain to me why the following example is not working
>>>> properly when <petsc.h> (from master, with OpenMPI 1.8.1) is included?
>>>>
>>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc
>>>> $ mpirun -np 2 ./a.out
>>>> Done with the scatter !
>>>> 0 0 0 0 (this line should be filled with 0)
>>>> 1 1 1 1 (this line should be filled with 1)
>>>> Done with the gather !
>>>>
>>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -DPETSC_BUG
>>>> $ mpirun -np 2 ./a.out
>>>> [:3367] *** An error occurred in MPI_Type_size
>>>> [:3367] *** reported by process [4819779585,140733193388032]
>>>> [:3367] *** on communicator MPI_COMM_WORLD
>>>> [:3367] *** MPI_ERR_TYPE: invalid datatype
>>>> [:3367] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>> [:3367] *** and potentially your MPI job)
>>>>
>>>> Thank you for looking,
>>>> Pierre
>>>>
>>>> <in-place.cpp>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>
>
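Pierre's attachment <in-place.cpp> is not preserved above. A minimal sketch of the kind of program that would produce the behaviour he reports is below; it is only a guess, not his actual code, and the PETSC_BUG switch and ROOT_RECVTYPE name are this sketch's own. With -DPETSC_BUG the root passes MPI_DATATYPE_NULL for the receive type, which the MPI standard permits because that argument is ignored when MPI_IN_PLACE is used, but which PETSc's logging macro still hands to MPI_Type_size():

  #include <petsc.h>   /* petsclog.h redefines MPI_Scatter() as a logging macro */
  #include <stdio.h>
  #include <stdlib.h>

  #ifdef PETSC_BUG
  #define ROOT_RECVTYPE MPI_DATATYPE_NULL   /* ignored at the root with MPI_IN_PLACE, so legal */
  #else
  #define ROOT_RECVTYPE MPI_INT             /* equally ignored, but harmless to MPI_Type_size() */
  #endif

  int main(int argc,char **argv)
  {
    int rank,size,i,*buf;

    PetscInitialize(&argc,&argv,NULL,NULL);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    buf = (int*)malloc(4*size*sizeof(int));
    for (i = 0; i < 4*size; i++) buf[i] = i/4;   /* segment j of the root buffer holds the value j */
    if (rank == 0) MPI_Scatter(buf,4,MPI_INT,MPI_IN_PLACE,0,ROOT_RECVTYPE,0,MPI_COMM_WORLD);
    else           MPI_Scatter(NULL,0,MPI_INT,buf,4,MPI_INT,0,MPI_COMM_WORLD);
    if (rank == 0) printf("Done with the scatter !\n");
    printf("%d %d %d %d (this line should be filled with %d)\n",buf[0],buf[1],buf[2],buf[3],rank);
    free(buf);
    PetscFinalize();
    return 0;
  }

Without -DPETSC_BUG the receive type is a real datatype, so the logging wrapper has nothing to complain about, which would explain why only the second build aborts under OpenMPI.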

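The check Barry describes in barry/fix-usage-with-mpidatatypenull presumably amounts to short-circuiting PetscMPITypeSize() when the datatype is null; the actual commit in that branch may differ, but a sketch of the idea is:

  PETSC_STATIC_INLINE PetscErrorCode PetscMPITypeSize(PetscLogDouble *buff,PetscMPIInt count,MPI_Datatype type)
  {
    PetscMPIInt mysize;
    if (type == MPI_DATATYPE_NULL) return 0;   /* e.g. the ignored recvtype of an MPI_IN_PLACE scatter: nothing to log */
    return (MPI_Type_size(type,&mysize) || ((*buff += (PetscLogDouble)(count*mysize)),0));
  }

This keeps the logged message lengths exact for ordinary calls and simply skips the contribution of arguments that the MPI standard says are ignored, so no MPI error can be raised on a null datatype.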