On Tue, Sep 16, 2014 at 8:28 AM, Pierre Jolivet <[email protected]> wrote:
> On 2014-09-15 11:40, Matthew Knepley wrote:
>
>> On Sun, Sep 14, 2014 at 8:36 PM, Barry Smith <[email protected]> wrote:
>>
>>>   Pierre,
>>>
>>>     Thanks for reporting this, it is, indeed, our bug. In petsclog.h we
>>> have macros for the various MPI calls in order to log their usage, for
>>> example,
>>>
>>> #define MPI_Scatter(sendbuf,sendcount,sendtype,recvbuf,recvcount,recvtype,root,comm) \
>>>   ((petsc_scatter_ct++,0) || \
>>>    PetscMPITypeSize(&petsc_recv_len,recvcount,recvtype) || \
>>>    MPI_Scatter(sendbuf,sendcount,sendtype,recvbuf,recvcount,recvtype,root,comm))
>>>
>>> but PetscMPITypeSize() simply called MPI_Type_size(), which generated an
>>> MPI error for MPI_DATATYPE_NULL:
>>>
>>> PETSC_STATIC_INLINE PetscErrorCode PetscMPITypeSize(PetscLogDouble *buff,PetscMPIInt count,MPI_Datatype type)
>>> {
>>>   PetscMPIInt mysize;
>>>   return (MPI_Type_size(type,&mysize) || ((*buff += (PetscLogDouble)(count*mysize)),0));
>>> }
>>
>> What error did you get? Why did I not get this error when I ran it? I ran
>> with MPICH 3.0.4 since that was the one I had compiled for C++.
>
> (sorry for the late answer)
> $ mpicxx in-place.cpp -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -DPETSC_BUG
> $ mpirun -np 4 ./a.out
> Fatal error in PMPI_Type_size: Invalid datatype, error stack:
> PMPI_Type_size(117): MPI_Type_size(MPI_DATATYPE_NULL) failed
> PMPI_Type_size(67).: Datatype for argument datatype is a null datatype
> $ mpicxx -show
> g++ -I/opt/mpich-3.1/build/include -L/opt/mpich-3.1/build/lib -lmpichcxx -Wl,-rpath -Wl,/opt/mpich-3.1/build/lib -lmpich -lopa -lmpl -lrt -lpthread

Thanks for tracking this down. This seems to be a change from MPICH 3.0.4
to 3.1. Luckily, Barry found it too.

   Matt

> Pierre
>
>>   Matt
>>
>>> In the branch barry/fix-usage-with-mpidatatypenull I have added a check
>>> for this special case and avoid the MPI_Type_size() call. I will put this
>>> branch into next and, if all tests pass, it will be merged into maint and
>>> master and be in the next patch release.
>>>
>>>   Thank you for reporting the problem.
>>>
>>>   Barry
>>>
>>> Barry still thinks MPI 1.1 is the height of HPC computing :-(
>>>
>>> On Sep 14, 2014, at 4:16 PM, Pierre Jolivet <[email protected]> wrote:
>>>
>>>> Hello,
>>>> Could you please explain to me why the following example is not working
>>>> properly when <petsc.h> (from master, with OpenMPI 1.8.1) is included?
>>>>
>>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc
>>>> $ mpirun -np 2 ./a.out
>>>> Done with the scatter !
>>>> 0 0 0 0 (this line should be filled with 0)
>>>> 1 1 1 1 (this line should be filled with 1)
>>>> Done with the gather !
>>>>
>>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -DPETSC_BUG
>>>> $ mpirun -np 2 ./a.out
>>>> [:3367] *** An error occurred in MPI_Type_size
>>>> [:3367] *** reported by process [4819779585,140733193388032]
>>>> [:3367] *** on communicator MPI_COMM_WORLD
>>>> [:3367] *** MPI_ERR_TYPE: invalid datatype
>>>> [:3367] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>> [:3367] ***   and potentially your MPI job)
>>>>
>>>> Thank you for looking,
>>>> Pierre
>>>>
>>>> <in-place.cpp>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
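For context, the failing usage is legal MPI: under MPI_IN_PLACE the standard ignores the corresponding count/type arguments, so a caller may pass MPI_DATATYPE_NULL there, and both MPICH 3.1 and OpenMPI 1.8.1 reject MPI_DATATYPE_NULL in MPI_Type_size(). The actual commit in barry/fix-usage-with-mpidatatypenull is not shown in this thread; presumably the check Barry describes amounts to something like the following sketch, which skips the byte-count logging when the datatype is null:

PETSC_STATIC_INLINE PetscErrorCode PetscMPITypeSize(PetscLogDouble *buff,PetscMPIInt count,MPI_Datatype type)
{
  PetscMPIInt mysize;
  /* MPI_DATATYPE_NULL is legal for arguments that MPI ignores (e.g. under
     MPI_IN_PLACE); skip the MPI_Type_size() call instead of letting it
     raise an error, and log no bytes for this argument */
  if (type == MPI_DATATYPE_NULL) return 0;
  return (MPI_Type_size(type,&mysize) || ((*buff += (PetscLogDouble)(count*mysize)),0));
}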

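The attached in-place.cpp is not reproduced in the thread; a minimal program along the following lines (a hypothetical reconstruction, not Pierre's actual file) shows the failure mode. In a logging-enabled PETSc build, including <petsc.h> swaps in the MPI_Scatter macro quoted above, so the root's legal MPI_DATATYPE_NULL for the ignored recvtype reaches MPI_Type_size():

// in-place-sketch.cpp -- hypothetical reconstruction of the reproducer
#include <petsc.h>   // pulls in petsclog.h, whose macro wraps MPI_Scatter
#include <cstdio>

int main(int argc, char **argv)
{
  PetscInitialize(&argc, &argv, NULL, NULL);
  PetscMPIInt rank, size;
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  MPI_Comm_size(PETSC_COMM_WORLD, &size);

  const int n = 4;                 // entries per rank
  int *all = NULL;                 // scatter source, significant at the root only
  int mine[n];                     // receive buffer on non-root ranks
  if (rank == 0) {
    all = new int[n * size];
    for (int i = 0; i < n * size; ++i) all[i] = i / n;   // chunk j is filled with j
  }

  if (rank == 0)  // root keeps its chunk in place; recvcount/recvtype are ignored
    MPI_Scatter(all, n, MPI_INT, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, PETSC_COMM_WORLD);
  else            // sendbuf/sendcount/sendtype are ignored away from the root
    MPI_Scatter(NULL, 0, MPI_DATATYPE_NULL, mine, n, MPI_INT, 0, PETSC_COMM_WORLD);

  const int *v = rank ? mine : all;  // root 0's chunk sits at the front of all[]
  printf("%d %d %d %d (this line should be filled with %d)\n", v[0], v[1], v[2], v[3], rank);

  delete[] all;
  PetscFinalize();
  return 0;
}

The gather direction in Pierre's output is symmetric: with MPI_IN_PLACE as sendbuf at the root of MPI_Gather, sendcount/sendtype are the ignored arguments, and MPI_DATATYPE_NULL there trips the same MPI_Type_size() call in the corresponding logging macro.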