On Sun, Sep 14, 2014 at 5:56 PM, Christophe Prud'homme <[email protected]> wrote:

> Hello,
>
> Pierre and I found this problem occurring with OpenMPI 1.8.1 and PETSc 3.5.1
> (this was done on my side using Homebrew on OS X); we haven't checked MPICH.
> However, it is disturbing that removing the petsc.h header actually solves
> the problem.
>
> From the MPI standard, if we understand correctly, we shouldn't have to
> specify the data type for MPI_IN_PLACE scatter/gather operations. Yet to
> avoid a crash, a deadlock (scatter), or wrong results (gather) with
> boost::mpi, we need to specify the data type.
>
> In our code Feel++ we have added some tests to verify this behavior [1],
> basically:
> - mpi alone: OK
> - boost::mpi alone: OK (boost::mpi is just used as an alternative way to
>   initialize mpi)
> - mpi + petsc.h with MPI_DATATYPE_NULL: crash in scatter
> - mpi + petsc.h + proper datatype: OK
> - boost::mpi.h + petsc with MPI_DATATYPE_NULL: hangs in scatter
> - boost::mpi.h + petsc + proper datatype: OK
>
> 1. https://github.com/feelpp/feelpp/blob/develop/testsuite/feelcore/test_gatherscatter.cpp
>

1) I think this is an OpenMPI bug.
2) I think this because I cannot reproduce it with MPICH, and the OpenMPI run
   is not valgrind clean.
3) OpenMPI has lots of bugs.

If you guys can reproduce with MPICH, I will track it down and fix it.

   Matt

> Best regards,
> C.
>
> On Mon, Sep 15, 2014 at 12:37 AM, Matthew Knepley <[email protected]> wrote:
>
>> On Sun, Sep 14, 2014 at 4:16 PM, Pierre Jolivet <[email protected]> wrote:
>>
>>> Hello,
>>> Could you please explain to me why the following example is not working
>>> properly when <petsc.h> (from master, with OpenMPI 1.8.1) is included?
>>>
>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include
>>>     -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc
>>> $ mpirun -np 2 ./a.out
>>> Done with the scatter !
>>> 0 0 0 0 (this line should be filled with 0)
>>> 1 1 1 1 (this line should be filled with 1)
>>> Done with the gather !
>>>
>>> $ mpicxx in-place.cpp -I$PETSC_DIR/include
>>>     -I$PETSC_DIR/$PETSC_ARCH/include -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc
>>>     -DPETSC_BUG
>>> $ mpirun -np 2 ./a.out
>>> [:3367] *** An error occurred in MPI_Type_size
>>> [:3367] *** reported by process [4819779585,140733193388032]
>>> [:3367] *** on communicator MPI_COMM_WORLD
>>> [:3367] *** MPI_ERR_TYPE: invalid datatype
>>> [:3367] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now
>>> abort, and potentially your MPI job)
>>>
>>
>> I just built this with MPICH and it runs fine:
>>
>> master:/PETSc3/petsc/petsc-pylith$
>> /PETSc3/petsc/petsc-pylith/arch-pylith-cxx-debug/bin/mpiexec -host
>> localhost -n 2
>> /PETSc3/petsc/petsc-pylith/arch-pylith-cxx-debug/lib/in-place-obj/in-place
>> Done with the scatter !
>> 0 0 0 0 (this line should be filled with 0)
>> 1 1 1 1 (this line should be filled with 1)
>> Done with the gather !
>>
>> Will valgrind.
>>
>>    Matt
>>
>>> Thank you for looking,
>>> Pierre
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>
>
> --
> Christophe Prud'homme
> Feel++ Project Manager
> Professor in Applied Mathematics
> @ Université Joseph Fourier (Grenoble, France)
> @ Université de Strasbourg (France)

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
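
[Editor's note] For readers following along, below is a minimal, self-contained
sketch of the in-place pattern discussed in this thread. It is not the original
in-place.cpp, nor the Feel++ test linked above; the USE_NULL_DATATYPE macro is a
hypothetical stand-in for the -DPETSC_BUG switch, and reproducing the reported
failure would additionally require including petsc.h and building against
OpenMPI 1.8.1. The relevant MPI-standard rule is that when MPI_IN_PLACE is
passed at the root, the root's recvcount/recvtype (scatter) and
sendcount/sendtype (gather) are ignored, which is why MPI_DATATYPE_NULL should
be acceptable there.

// in-place-sketch.cpp -- illustrative only, not the original test case.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 4;       // entries per rank
    const int root = 0;

    // The root holds the full array (size * n entries); other ranks hold n.
    std::vector<double> buf(rank == root ? size * n : n, 0.0);

#ifdef USE_NULL_DATATYPE
    // Hypothetical stand-in for the thread's -DPETSC_BUG switch: the root's
    // datatype argument next to MPI_IN_PLACE, which the standard says is
    // ignored, is set to MPI_DATATYPE_NULL.
    const MPI_Datatype root_type = MPI_DATATYPE_NULL;
#else
    const MPI_Datatype root_type = MPI_DOUBLE;
#endif

    if (rank == root) {
        // Block i of the send buffer is filled with the destination rank i.
        for (int i = 0; i < size * n; ++i) buf[i] = i / n;
        // Root receives "in place": its recvcount and recvtype are ignored.
        MPI_Scatter(buf.data(), n, MPI_DOUBLE,
                    MPI_IN_PLACE, n, root_type, root, MPI_COMM_WORLD);
        std::printf("Done with the scatter !\n");
    } else {
        MPI_Scatter(nullptr, 0, MPI_DOUBLE,
                    buf.data(), n, MPI_DOUBLE, root, MPI_COMM_WORLD);
    }

    // Every rank should now see its own rank number n times.
    std::printf("%d: %g %g %g %g\n", rank, buf[0], buf[1], buf[2], buf[3]);

    if (rank == root) {
        // Root sends "in place": its sendcount and sendtype are ignored.
        MPI_Gather(MPI_IN_PLACE, n, root_type,
                   buf.data(), n, MPI_DOUBLE, root, MPI_COMM_WORLD);
        std::printf("Done with the gather !\n");
    } else {
        MPI_Gather(buf.data(), n, MPI_DOUBLE,
                   nullptr, 0, MPI_DOUBLE, root, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

Compiled with mpicxx and run with mpirun -np 2 ./a.out, both variants should
print each rank's block filled with its own rank number. The thread reports
that once petsc.h is included with OpenMPI 1.8.1, the MPI_DATATYPE_NULL variant
instead aborts in MPI_Type_size with MPI_ERR_TYPE (plain MPI) or hangs in the
scatter (boost::mpi), while MPICH runs it cleanly.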
